Coupled Data Assimilation for Integrated 
Earth System Analysis and Prediction: 
Goals, Challenges, and Recommendations 


Authors: Stephen G. Penny (UMD, NOAA/NCEP), Santha Akella (NASA/GMAO), Mark 
Buehner (ECCC), Matthieu Chevallier (CNRM; WWRP/PPP), Francois Counillon (NERSC), 
Clara Draper (NASA/GMAO), Sergey Frolov (NRL), Yosuke Fujii (JMA/MRI), Alicia
Karspeck (NCAR), Arun Kumar (NOAA/NCEP), Patrick Laloyaux (ECMWF), Jean-Francois 
Mahfouf (Meteo-France), Matthew Martin (UK MetOffice), Malaquias Peña (NOAA/NCEP),
Patricia de Rosnay (ECMWF), Aneesh Subramanian (University of Oxford), Robert Tardif 
(University of Washington), Xingren Wu (NOAA/NCEP). 


Reviewers: Jeff Anderson (NCAR), Alberto Carrassi (NERSC), Eugenia Kalnay (UMD), Daryl 
Kleist (NOAA/NCEP), Ricardo Todling (NASA/GMAO). 


Executive Summary 


The purpose of this report is to identify fundamental issues for coupled data assimilation (CDA), 
such as gaps in science and limitations in forecasting systems, in order to provide guidance to the 
World Meteorological Organization (WMO) on how to facilitate more rapid progress 
internationally. 


Coupled Earth system modeling provides the opportunity to extend skillful atmospheric forecasts 
beyond the traditional two-week barrier by extracting skill from low-frequency state components 
such as the land, ocean, and sea ice. More generally, coupled models are needed to support 
seamless prediction systems that span timescales from weather, subseasonal to seasonal (S2S), 
multiyear, and decadal. Therefore, initialization methods are needed for coupled Earth system 
models, either applied to each individual component (called Weakly Coupled Data Assimilation 
- WCDA) or applied to the coupled Earth system model as a whole (called Strongly Coupled Data
Assimilation - SCDA). Using CDA, in which model forecasts and potentially the state estimation 
are performed jointly, each model domain benefits from observations in other domains either 
directly using error covariance information known at the time of the analysis (SCDA), or 
indirectly through flux interactions at the model boundaries (WCDA). Because the 
non-atmospheric domains are generally under-observed compared to the atmosphere, CDA 
provides a significant advantage over single-domain analyses. 


Next, we provide a synopsis of goals, challenges, and recommendations to advance CDA: 


Goals: (a) Extend predictive skill beyond the current capability of NWP (e.g. as demonstrated by 
improving forecast skill scores), (b) produce physically consistent initial conditions for coupled 
numerical prediction systems and reanalyses (including consistent fluxes at the domain 
interfaces), (c) make best use of existing observations by allowing observations from each 
domain to influence and improve the full earth system analysis, (d) develop a robust 
observation-based identification and understanding of mechanisms that determine the variability 
of weather and climate, (e) identify critical weaknesses in coupled models and the earth 
observing system, (f) generate full-field estimates of unobserved or sparsely observed variables, 
(g) improve the estimation of the external forcings causing changes to climate, (h) transition 
successes from idealized CDA experiments to real-world applications. 


Challenges: (a) Modeling at the interfaces between interacting components of coupled Earth 
system models may be inadequate for estimating uncertainty or error covariances between 
domains, (b) current data assimilation methods may be insufficient to simultaneously analyze 
domains containing multiple spatiotemporal scales of interest, (c) there is no standardization of 
observation data or their delivery systems across domains, (d) the size and complexity of many 
large-scale coupled Earth system models make it difficult to accurately represent uncertainty
due to model parameters and coupling parameters, (e) model errors lead to local biases that can
transfer between the different Earth system components and lead to coupled model biases and
long-term model drift, (f) information propagation across model components with different
spatiotemporal scales is extremely complicated, and must be improved in current coupled
modeling frameworks, (g) there is insufficient knowledge on how to represent evolving errors in
non-atmospheric model components (e.g. sea ice, land, and ocean) on the timescales of NWP.


Recommendations: (a) Standardize the observing network for all earth system domains that 
meets the timeliness and quality control requirements of NWP, (b) identify gaps in the observing 
system that are essential for constraining CDA applications, including fluxes at the domain 
interfaces, (c) support international coordinated efforts to study and compare operational-scale 
CDA approaches, (d) support academic/research community to explore fundamental aspects of 
CDA via simplified and intermediate complexity models, (e) develop CDA methods that can 
assimilate multiple spatiotemporal scales in support of the seamless prediction paradigm, (f) 
develop methods for CDA to identify, isolate, and elucidate model errors and biases that degrade 
forecast skill, which can also be used to directly improve coupled modeling, (g) promote 
improved representation of model uncertainty in the forecast system using stochastic physics and 
other advanced methods.


Summary Report and Recommendations


Introduction 


As operational Numerical Weather Prediction (NWP) centers around the world strive to extend 
the skillful range of their forecasts, it has become increasingly acknowledged that this will 
require the modeling and initialization of not only the atmosphere but also other domains of the 
earth system. The climate and subseasonal-to-seasonal (S2S) prediction communities have 
already conducted significant effort in this area by coupling independent models to represent the 
complete earth system (e.g. Meehl, 1995; Delworth et al., 2006; Saha et al., 2006; Koster et al., 
2010; Dunne et al., 2012; Orsolini et al., 2013; Robertson et al., 2015; Kucharski et al., 2015; 
Guémas et al., 2016; White et al., 2017). Such coupled models comprise two or more 
components including the atmosphere, ocean, land, sea ice, surface waves, atmospheric aerosols, 
and ocean biogeochemistry. The challenge now becomes bridging the developments in coupled 
modeling for climate and S2S timescales down to NWP timescales, while maintaining the skill 
of atmosphere-only NWP products and simultaneously extending this skill to the longer 
timescales (Kinter et al., 2016). This challenge is particularly relevant to operational centers 
adopting a ‘seamless prediction’ paradigm (Palmer et al., 2008; Hoskins, 2013) in which a single 
coupled forecast system is applicable to weather, S2S, and longer climate timescales. 

The ongoing transition to high-resolution and coupled modeling capabilities must be 
accompanied by a similar transformation and adaptation of the DA procedures (e.g. Sun et al., 
2014). The design of efficient Coupled Data Assimilation (CDA) methods, able to 
simultaneously control all resolved scales and propagate information across the climate system 
components, is of primary importance to effectively initialize these coupled forecasting systems. 

Coupled data assimilation can be broadly categorized into two types: weakly coupled DA 
(WCDA) and strongly coupled DA (SCDA). Weakly coupled DA indicates that the coupling 
occurs during the forecast by using a coupled forecast model. For WCDA, the state of the 
coupled model is differenced with observed values to produce the innovations, but the analysis is
still computed independently for each domain. As a consequence, the direct impacts of
observations on the analysis are limited to the domain in which the observations reside, while 
cross-domain impacts are produced as a secondary effect by the integration of the coupled model 
during the forecast step. Strongly coupled DA indicates that cross-domain error covariances are 
utilized so that the entire earth system is analyzed as if it were a single system. This approach has 
the advantage of allowing all available observations at a given time to have instantaneous impact 
across all domains when forming the analysis (within the limits of any imposed localization 
length scales). Strongly coupled DA thus helps the entire coupled system to be in relative 
agreement with all available observations while largely reducing initialization shocks caused by 
each component being analyzed with a disjoint set of observations. As with WCDA, SCDA also 
allows past data to be propagated through the system during the coupled forecast. In addition to 
these two approaches, there are many variations that have been attempted in the research 
community that may be viable for different applications. 
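The distinction between the two approaches can be illustrated with a minimal sketch. It assumes a hypothetical two-variable state (one atmospheric and one oceanic temperature), a single atmospheric observation, and made-up error variances and cross-domain covariance; it is not drawn from any operational system. Zeroing the cross-domain terms of the background error covariance stands in for WCDA, which in this simple setting is equivalent to analyzing each domain independently.

```python
import numpy as np

def kalman_analysis(xb, B, H, R, y):
    """Standard linear analysis update: xa = xb + K (y - H xb)."""
    K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)
    return xb + K @ (y - H @ xb)

# Hypothetical background state: [atmosphere temperature, ocean temperature] in K.
xb = np.array([290.0, 288.0])

# Assumed background error covariance with a cross-domain covariance of 0.5.
B_strong = np.array([[1.0, 0.5],
                     [0.5, 1.0]])
# Weak coupling: cross-domain covariances are not used in the analysis.
B_weak = np.diag(np.diag(B_strong))

H = np.array([[1.0, 0.0]])   # only the atmosphere is observed
R = np.array([[1.0]])        # assumed observation error variance
y = np.array([292.0])        # assumed atmospheric observation

xa_scda = kalman_analysis(xb, B_strong, H, R, y)
xa_wcda = kalman_analysis(xb, B_weak, H, R, y)

print("SCDA analysis:", xa_scda)
print("WCDA analysis:", xa_wcda)
```

With these assumed numbers, both analyses move the atmospheric temperature to 291 K, but only the SCDA analysis changes the ocean temperature (to 288.5 K). In the WCDA case the ocean would feel the atmospheric observation only later, through the coupled forecast.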

CDA has applications in prediction on many timescales, including NWP, S2S, multiyear, 
and decadal prediction. It has use for prediction in high-impact weather scenarios (e.g. hurricane 
prediction), and in regional prediction (e.g. polar prediction, Jung et al., 2016). CDA also has 
applications in reanalysis, for example in the reconstruction of historical climate. We note that 
the overall goals of these broad application areas are somewhat different. For example, the 
former typically seeks to find initial conditions that lead to the most accurate forecast, while the 
latter seeks a state estimate or trajectory that most closely represents the true state or trajectory of 
the coupled system. These differing goals may lead to different solution approaches. 


Benefits of Coupled DA 


Expected Benefits of Coupled DA 


Transitioning from WCDA to SCDA helps to reduce imbalances between domains caused when 
the separate domains are updated independently using disjoint information of the earth system. 
These imbalances are often referred to as “shocks” in the initialization of coupled systems. While 
the reduction or elimination of these shocks is expected with SCDA, there has been limited 
research supporting this claim. For example, it is unclear whether the current modeling of 
domain interfaces is sufficient to support the use of model-derived cross-domain error 
covariances in realistic applications. 

Many surface observations, in particular those made from satellite platforms, are 
influenced not only by the ocean or land surface but also the atmosphere above. CDA provides 
the opportunity for an online computation using a consistent observation operator that includes 
the forecasted state of both the surface and the overlying atmosphere. For example, Sea Surface 
Temperature (SST) and Sea Surface Salinity (SSS) measurements rely on radiative transfer 
models that should include the modeled atmosphere directly above. As another example, gravity
measurements made by the GRACE and GOCE satellites are influenced by the total integrated
column of mass (with variability strongly dominated by water content), from the bottom of the 
ocean to the top of the atmosphere (TOA), or throughout the land surface to the TOA. It is 
expected that this more accurate computation of the ‘model equivalent’ to the observations will 
lead to improved accuracy in CDA analyses. 
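As a rough illustration of a coupled observation operator, the sketch below uses a toy single-layer, non-scattering radiative transfer relation in which the simulated brightness temperature depends on both the surface temperature and the temperature of the overlying atmosphere. The function name, the relation, and all numbers are illustrative assumptions, not the radiative transfer model of any center.

```python
def toy_brightness_temperature(sst, t_atm, transmittance):
    """Toy model equivalent of a satellite brightness temperature (K).

    Assumes one atmospheric layer with the given transmittance (0..1),
    unit surface emissivity, and no scattering. Hypothetical relation for
    illustration only.
    """
    return transmittance * sst + (1.0 - transmittance) * t_atm

# Coupled operator: uses the forecast atmosphere directly above the ocean point.
tb_coupled = toy_brightness_temperature(sst=300.0, t_atm=280.0, transmittance=0.9)

# Uncoupled operator: the ocean DA system must assume an atmosphere,
# here a hypothetical climatological value.
tb_uncoupled = toy_brightness_temperature(sst=300.0, t_atm=270.0, transmittance=0.9)

print(tb_coupled, tb_uncoupled)
```

In this toy setting the two operators disagree by about 1 K for the same SST, a difference that an uncoupled system would wrongly attribute to the ocean state.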

Another key opportunity for CDA is to constrain parts of the earth system that are poorly 
constrained by a single-component DA system. For example, assimilation of ice properties using 
SCDA may improve estimates of both oceanic and atmospheric states that are poorly constrained 
by independent atmospheric or oceanic DA systems because of a dearth of oceanic and 
atmospheric observations in the polar regions. Other opportunities include: (1) use of wave 
information to constrain transfer of energy between ocean waves and the atmosphere, (2) use of 
cloud observations to constrain heat exchange in the ocean, (3) use of near surface observations 
to constrain poorly known land surface conditions for the atmosphere, (4) constraining the transport of
marine aerosols into the atmosphere, (5) adding value to new Earth observations such as the Ice-Tethered
Profilers, measuring ocean temperature and salinity below ice (whoi.edu/page.do?pid=20756) or 
the European Space Agency (ESA) missions that provide Essential Climate Variables (ECV) 
covering components such as Soil Moisture and Ocean Salinity (SMOS), (6) use of observations 
at the air-sea interface from TPOS-type mooring arrays to better constrain the ocean-atmosphere 
coupled system in the Tropics. 

Sea ice concentrations are well observed by many types of sensors, though there are 
limitations in summer near land, thin ice, and during high winds. Ice drift is well observed from 
sequential satellite images, but there is a challenge in knowing whether to correct winds, ocean 
currents, drag coefficients, ice strength, or other parameters. Sea ice thickness is less well 
observed, but important for air/sea interaction and long term prediction. Errors in ocean SST, 
near surface currents, surface air temperature and winds can dominate ice forecast errors, 
especially since both are poorly observed near ice. Simple CDA approaches help by allowing sea 
ice observations to impose consistent SST, and to correct winds and currents. Air temperature 
and winds can be used for quality control of sea ice observations. However, linear/Gaussian 
approaches for strong coupling may not be adequate, particularly near the freezing points and at 
the creation of ridges. 


Early Evidence for the Benefits of Coupled DA 


To date, the number of studies documenting the degree of improvement in prediction and
reanalysis applications is small, but those studies indicate great potential for CDA.

It is expected that by ensuring consistency between the analyzed states of the coupled
Earth system components, CDA systems will produce more balanced states and fluxes at the
interfaces (Zhang et al. 2005, 2007; Sugiura et al. 2008; Zhang 2011). Mulholland et al. (2015)
found that forecasts initialized by CDA do reduce initialization shocks in lower atmospheric
temperature relative to separate oceanic and atmospheric analyses, in some regions halving the
RMSE on the first day of the forecast with sustained improvements for the duration of 10-day 
forecasts. As a cautionary note, improvements were found to be negligible when assessed using 
independent datasets over the limited period studied. 

Tardif et al. (2014, 2015) used a SCDA approach and found that assimilating 
time-average observations into an idealized low-order coupled system improved state estimates 
at interannual to decadal timescales. In more recent research reported during the workshop, a
multi-timescale extension of the DA method, in which the long-timescale ocean states were
constrained with atmospheric observations on 20-year as well as monthly timescales, provided
more accurate analyses of the fast and slow components of the system than the assimilation of
non-averaged observations. A clear benefit of the SCDA approach was highlighted in a poorly
observed ocean scenario. 
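One common way to assimilate time-averaged observations with an ensemble is to split each member's trajectory into its time mean and the deviations from that mean, update only the time mean, and then add the unchanged deviations back. The sketch below illustrates that general idea for a scalar state with a scalar square-root update. It is an assumption-laden illustration, not the implementation used in the studies cited above, and the function name and inputs are hypothetical.

```python
import numpy as np

def assimilate_time_average(X, y_mean, r):
    """Update an ensemble of scalar trajectories with one time-averaged observation.

    X      : array (n_members, n_times), ensemble of trajectories over the window
    y_mean : observed time average over the same window
    r      : error variance of the time-averaged observation
    """
    member_means = X.mean(axis=1)               # time mean of each member
    deviations = X - member_means[:, None]      # left untouched by the update
    ens_mean = member_means.mean()
    perts = member_means - ens_mean
    var = perts.var(ddof=1)                     # ensemble variance of the time mean
    K = var / (var + r)                         # scalar Kalman gain
    ens_mean_a = ens_mean + K * (y_mean - ens_mean)
    perts_a = np.sqrt(1.0 - K) * perts          # square-root update: variance -> (1 - K) * var
    member_means_a = ens_mean_a + perts_a
    return member_means_a[:, None] + deviations

# Hypothetical two-member, two-time ensemble.
X = np.array([[0.0, 2.0],
              [2.0, 4.0]])
Xa = assimilate_time_average(X, y_mean=4.0, r=2.0)
print(Xa)
```

In this example the ensemble-mean time average moves from 2 to 3, the spread of the time mean is reduced, and the within-window variability of each member is preserved.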

Lu et al. (2015a/b) used what they called a Leading Averaged Coupled Covariance 
(LACC) CDA approach, attempting to address different time scales between model components. 
They focused on the effect of assimilating atmospheric surface temperature on the sea surface 
temperature (SST) in a coupled general circulation model (GCM). 

Sluka et al. (2016) indicated significant improvement (over 40%) in a perfect model 
scenario by using SCDA versus WCDA in an observing system simulation experiment (OSSE) 
that assimilated atmospheric observations into the ocean. The system used an intermediate 
complexity global coupled model (SPEEDY/NEMO) at coarse resolution with a 6-hour DA 
coupling interval. Improvements were present through the full depth of the water column
across most of the world ocean, at all latitudes except the Arctic.

It has been shown that the ocean coupling is important for the analysis step to correctly 
estimate the initial conditions for tropical cyclones (Laloyaux et al., 2016b). Uncoupled 
hurricane models typically create too-deep TCs because there is no cold-wake in the SST field. 
In the coupled case, mean sea level pressure (MSLP) forecast error is reduced 
(http://www.ecmwf.int/en/elibrary/16980-tropical-cyclone-sensitivity-ocean-couplin 

In ECMWF’s CERA-20C (http://www.ecmwf.int/en/research/climate-reanalysis/cera-20c), the ocean
and the atmosphere communicate every hour through the air-sea coupling at the outer-loop level
of the variational method. Changes in the state of the atmosphere may directly impact the ocean
properties and vice versa. The combination of CDA and additional improvements in the atmospheric DA
corrected a spurious trend in the net heat fluxes received by the ocean, as identified in ORA-20C.
On average, heat flux and ocean temperature increments in CERA-20C oscillate around 0 W/m²,
suggesting a more balanced system state estimate. In addition, in ERA-20C there were no
Tropical Instability Waves (TIW) or wind stress signals (the system was forced by monthly 
SSTs). The CERA-20C does produce TIWs in the ocean, with the atmosphere responding 
accordingly, leading to an improved correlation between SST and wind stress. Similarly, the 
CFSR coupled reanalysis produced by NCEP also produced TIWs (Wen et al., 2012). 


Fujii et al. (2009; 2011) at JMA/MRI compared a quasi-WCDA system (i.e. the ocean 
component alone is assimilated) using Incremental Analysis Updates (IAU) with an analysis 
interval of 1 month versus an AMIP run forced by observed SST. Forcing the AMIP with 
observed SST did not lead to a more accurate analysis. Instead, precipitation was overestimated 
in the West Pacific, and underestimated in the Indian Ocean in winter. In summer, the position of 
the peak ITCZ was different in the AMIP, but modified by quasi-WCDA to be more accurate. 
With quasi-WCDA, precipitation corresponded to cyclonic structure, which did not occur in the 
AMIP run, and too-weak winds in the monsoon troughs were improved. In observed data, the 
correlations between SST and precipitation are not as strong as modeled in AMIP runs. Many of 
the improvements stemmed from the representation of the more realistic negative feedbacks 
between SST and precipitation in the coupled system, which reduced excessive rainfall over high 
SST regions. With quasi-WCDA, the false maximum over the western side of the Bay of Bengal 
disappeared and the contrast of the upper-troposphere velocity potential between the Pacific and 
Indian Ocean was intensified, resulting in enhancement of the Walker circulation and the 
monsoon trough in the western tropical Pacific. 
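For readers unfamiliar with Incremental Analysis Updates, the sketch below shows the basic idea: rather than adding the full analysis increment at the start of the window, a constant fraction of it is added at every model step. The scalar damped model and all numbers are hypothetical and serve only to illustrate the mechanism; they do not represent the JMA/MRI configuration.

```python
def integrate_with_iau(x0, increment, n_steps, step_model):
    """Spread the analysis increment evenly over n_steps as a constant forcing."""
    forcing = increment / n_steps
    x = x0
    for _ in range(n_steps):
        x = step_model(x) + forcing
    return x

def integrate_with_direct_insertion(x0, increment, n_steps, step_model):
    """Add the whole increment at the initial time, then run the model freely."""
    x = x0 + increment
    for _ in range(n_steps):
        x = step_model(x)
    return x

# Hypothetical damped scalar model: each step relaxes the state toward zero.
damped_step = lambda x: 0.9 * x

x_iau = integrate_with_iau(1.0, 0.6, 6, damped_step)
x_direct = integrate_with_direct_insertion(1.0, 0.6, 6, damped_step)
print(x_iau, x_direct)
```

With IAU the largest change imposed on the state at any single step is one sixth of the increment, which is the property that helps limit initialization shocks.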

Lin et al. (2016) indicated that the decadal shifts of East Asian summer monsoon were 
not recovered in simulations of an uncoupled atmospheric model, but correctly reproduced by a 
quasi-WCDA system (developed in IAP, China). A 7-day analysis cycle was used for the ocean 
data assimilation. They confirmed that the decadal shifts were not correctly reproduced when the 
assimilation interval is 1 day or 3 days, which indicates that at least a loose constraint of the 
ocean surface state is required to improve the atmospheric fields through reconstruction of the 
negative feedback between SST and precipitation. 

Lea et al. (2015) showed implementation of WCDA and the use of assimilation outputs 
to diagnose coupled model errors (e.g. diurnal cycle of SST, issues with river outflows). Lea et al. also
demonstrated smaller average SST increments than in uncoupled DA, indicating a better balanced
analysis at the air-sea interface. 

An excess latent heat flux seen in the JRA-55 is reduced in a WCDA system with a 
10-day (6-hour) ocean (atmosphere) cycle recently developed in JMA/MRI. The WCDA also 
suppressed excess precipitation seen in the JRA-55. These improvements may have also been 
due to improvement in the bulk formulae. There was an improved (weakened) ITCZ as well. 

There are indications that skilful ocean and sea ice initialization requires CDA because 
ocean and sea ice are tightly interconnected thermodynamically and ocean observations under 
sea-ice are lacking (Lisæter et al. 2003, Sakov et al. 2010, Massonnet et al. 2013). Flow
dependent assimilation methods have been shown to be useful for representing the coupled error
covariances between sea ice and ocean salinity, which are non-stationary and anisotropic and
crucial for preserving the ocean stratification (Lisæter et al. 2003, Sakov et al. 2010).


Detailed Recommendations 


The following recommendations have been identified by the workshop participants through
presentations, breakout discussions, and plenary session discussions, as important items that
must be addressed for the successful development of CDA. The recommendations are separated
into categories: “Research and Methodology”, “Organizational Planning”, and “Observing
Missions”.


Research and Methodology 


Organized research efforts are needed among the international community to evaluate the
benefits of (a) CDA versus independent DA for stand-alone modeling systems, and (b)
WCDA versus SCDA.

Research efforts are needed to examine the benefits and trade-offs across the spectrum
from weakly coupled to strongly coupled DA, as some centers may find that a full
transition to SCDA requires larger costs.

With the goal of seamless prediction, the international research community must develop 
and evaluate the impact of new methods for separating spatial and temporal scales in 
CDA (e.g. multiscale or multigrid DA). 

On a practical level, the impact of updating different model components at different
timescales must be investigated. This practice is currently used to maintain
stability in a few major CDA efforts, without rigorous theoretical justification.

The research community must identify the sources of first order errors that may prevent 
coupled forecast systems from clearly outperforming current operational NWP systems. 
For the purpose of seamless prediction, the research community must demonstrate that 
CDA not only improves initial states but also improves forecasts at all time scales. 

As a specific focus, evaluations should be made to identify the degree to which coupled 
modeling and CDA can improve medium-range atmospheric forecasts. Such evaluations 
should identify the benefits versus tradeoffs for varying degrees of coupling. 

The international research community must demonstrate the benefit of applying coupled 
observation operators within CDA systems, in comparison to current observation 
operator formulations, and in comparison to the assimilation of retrieval products. 

Some form of hybrid DA method is likely to be necessary. From the variational
perspective, complicated cross-domain covariances exist on multiple spatiotemporal
scales and only the generation of real-time ensembles will adequately capture the
dynamically varying covariances. From the EnKF perspective, the models are biased and
a purely model-derived error covariance will overestimate the accuracy of the model and
potentially ignore quality observations. Hybrid methods are maturing but still represent
an emerging field in DA, and relatively little effort has been put into applying hybrid
methods to CDA applications.
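A common form of hybrid method blends a static (climatological) background error covariance with a flow-dependent ensemble sample covariance. The sketch below shows that blending for a hypothetical two-variable coupled state. The weight `alpha`, the static covariance, and the tiny ensemble are illustrative assumptions, and operational hybrids typically apply the blend implicitly rather than forming full matrices.

```python
import numpy as np

def hybrid_covariance(ensemble, B_static, alpha):
    """Blend a static covariance with an ensemble sample covariance.

    ensemble : array (n_members, n_state) of coupled forecast states
    B_static : array (n_state, n_state), climatological covariance
    alpha    : weight on the ensemble covariance, between 0 and 1
    """
    P_ens = np.cov(ensemble, rowvar=False)   # flow-dependent sample covariance
    return (1.0 - alpha) * B_static + alpha * P_ens

# Hypothetical ensemble of [atmosphere, ocean] states with correlated errors.
ensemble = np.array([[0.0, 0.0],
                     [2.0, 2.0]])
# Assumed static covariance with no cross-domain terms.
B_static = np.eye(2)

B_hybrid = hybrid_covariance(ensemble, B_static, alpha=0.5)
print(B_hybrid)
```

Here the cross-domain covariance in the blended matrix comes entirely from the ensemble, while the static part keeps the variances from collapsing when the ensemble is small or overconfident.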

Further advances are required to apply nonlinear/non-Gaussian methods to the more 
nonlinear domains of the coupled earth system (e.g. for sea-ice), and to integrate these 
with the domains for which linear/Gaussian-based methods are adequate. 

The DA approaches for all domains must generate consistent analyses. It is common for 
weakly coupled DA to generate analyses for each domain that are inconsistent with one 
another, with the potential to create shocks when used to initialize the coupled model. A 
primary motivation for SCDA is to improve consistency and eliminate these shocks.

CDA should be used whenever possible when applying DA to non-atmospheric domains.
For example, low resolution ocean models behave largely as forced-damped systems that 
respond linearly to the atmospheric forcing. Because the atmospheric models typically 
provide a majority of the resolved nonlinearities on short timescales, coupling the ocean 
to the atmosphere using CDA provides the opportunity for lower resolution ocean models 
to be treated more appropriately as nonlinear dynamical systems. 

A first order challenge for CDA is determining what to do for drift caused by biases in 
coupled models. Accurate estimates of the interfaces between domains should be used to 
characterize systematic errors in each of the components and the fluxes between those 
components. This also requires (a) improved modeling of the interfaces between
domains, and (b) improved observing of these interfaces. 

Assimilating near-interface observations with a focus on the temperature (e.g. SST, LST) 
at the interface between many model component combinations would serve as a useful 
exercise to evaluate coupled error covariances. Such a focused experiment may also 
provide an opportunity to coordinate between multiple institutions and research teams.

Coherence between initial conditions of slow and fast modes relies on cross-domain error
covariances. Further research is needed to determine the appropriate characterization of
these cross-domain error covariances. A survey is needed to identify reliable means to
estimate these quantities for atmos/ocean, ocean/sea ice, atmos/land, ocean/land, etc.

Grid choice, choice of vertical coordinate, and choice of interpolation method to interface
multiple domains can impact the accuracy of coupled models and CDA systems. 
Research is needed to understand the amount of error introduced due to these sources, 
and quantify how much that error degrades the coupled model forecast. 

The impacts of WCDA versus SCDA should be quantified for the coupled ocean/sea ice 
setting, and coupling between other Earth system domain pairs (e.g. atmos/chem, 
ocean/land, etc.) should be thoroughly investigated as well. For example, SCDA may be 
a necessity for the coupled ocean/sea ice problem. If too much ice is melted the 
stratification can be degraded and this may have wider impacts on the overturning 
circulation.

The time-varying nature of cross-domain error covariances must be investigated in detail.
These covariances change on diurnal, seasonal, and longer timescales, and the research
community should quantify how their structure varies on each of these timescales.

Large ensembles may be needed in order to estimate error covariances outside of the
boundary layers. It must be determined what is lost in our estimates, both inside and outside
the boundary layers, when small ensemble approximations are used instead.
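The effect of ensemble size on a cross-domain correlation estimate can be illustrated with a simple sampling experiment. The sketch below draws repeated ensembles from a hypothetical bivariate Gaussian with a weak true correlation and measures how much the sample correlation scatters from one ensemble to the next. The true correlation, ensemble sizes, and trial count are arbitrary choices for illustration.

```python
import numpy as np

def correlation_spread(n_members, true_corr, n_trials, seed=0):
    """Standard deviation of the sample correlation across repeated ensembles."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, true_corr],
                    [true_corr, 1.0]])
    estimates = np.empty(n_trials)
    for i in range(n_trials):
        # One ensemble of paired (e.g. atmosphere, ocean) errors.
        sample = rng.multivariate_normal(np.zeros(2), cov, size=n_members)
        estimates[i] = np.corrcoef(sample, rowvar=False)[0, 1]
    return estimates.std()

spread_small = correlation_spread(n_members=10, true_corr=0.2, n_trials=200)
spread_large = correlation_spread(n_members=500, true_corr=0.2, n_trials=200)
print(spread_small, spread_large)
```

When the true correlation is weak, as may be expected away from the interface, the scatter of a small-ensemble estimate can exceed the signal itself, which is one reason localization or larger ensembles are used.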

Evaluate the impact of resolution on cross-domain error covariances. For example, 
air-sea interaction plays an important role in Tropical Cyclone (TC) development: the
ocean provides energy for the TC, while the TC-induced cold wake acts as a brake that prevents
over-intensification. The TC air-sea coupling effect becomes more impactful when there
are ocean eddies present. 

From a methods perspective, the key challenge for the next 3-5 years is to harden our 
tools for coupled modeling and CDA. In a 3-7 year timeframe, the community can start
addressing forecast and observation impact problems. To expedite research in the next 5 
years, the community should establish avenues for exchanging tools for 
single-component DA and CDA. 

For reanalysis applications, include a transition toward SCDA to make better use of the 
sparse historical observing network. Care must be taken to do this without also 
transferring biases (from both model and observations) between domains. 

While ensemble methods seem a straightforward strategy to estimate background error 
covariance, it is possible that deficiencies in the cross-domain interface dynamics may 
render large errors in these estimates. New methods and diagnostics are needed to 
evaluate the cross-domain error covariance derived from coupled ensembles. 

On a more fundamental level, a deeper understanding of the perturbation dynamics in
coupled systems is necessary, primarily to design ensemble-based methods for CDA. For
any coupled modeling system configuration, the saturation time scales and amplitudes of 
each component should be quantified, and it should be evaluated how these affect the 
other (faster and slower) model components. 

A better understanding of the impact of representing model error in CDA systems is 
necessary. This includes, for example, the impact of stochastic parameterizations, bias
corrections, surface flux relaxations, and other tools frequently used to correct model
errors in DA systems.

Representations of surface fluxes must be improved, whether through improvements to 
the bulk formulae or improved resolution and modeling of the near surface boundary 
layers. 

CDA OSSEs should be a priority application for future observation network design, and 
to help plan and prioritize future observation campaigns.


Organizational Planning


The culture of many non-atmospheric observing efforts, which have been focused
primarily on climate applications, is not set up in a way to transmit observational data
rapidly to operational centers for use in NWP. A shift is required that will allow ocean, 
snow, sea-ice, and other data to be used in near real-time NWP applications. 

Establish pathways for major operational centers to ease the transition from weakly
coupled DA systems pieced together from existing component systems to an integrated 
Strongly Coupled DA system with flexible capabilities to recover the component-wise 
DA as needed. 

Establish a research-to-operations pipeline to allow the academic research community the 
ability to (a) address scientific questions of interest to the operational centers, and (b) 
facilitate the transfer of new methods and knowledge from the research community to the 
operational centers. 

Within WMO, CDA may also require developing standards for exchanging 
non-atmospheric data, e.g., appropriate grib format standards. 

There is a need for sustained working groups and regular workshops that will promote 
international exchange of information in this diverse community.

Establish recommendations for assessing the impact of coupled NWP developments 
which take into account all state components. What length of trial is required to
demonstrate the impact of changes on operational systems? How should the verification of
coupled NWP systems be carried out? 

Establish common data sets/fields, such as (observational) innovations, air-sea fluxes for 
a common time-period between research groups for intercomparison. 

Build bridges, where possible, with the applied mathematics community, which has experience
with similar problems of stochastic filtering and state estimation in multiscale systems
(e.g. biology, neural networks, or electrical networks).


Observing Missions 


Increase the observing effort of the inter-domain interfaces. This includes measurements 
of air-sea fluxes, ice-ocean fluxes, air-land fluxes, etc. More “co-located” observations
spanning multiple domains are needed to constrain the coupled system, correct biases
in observations, and estimate observation errors. Focused field campaigns may be
valuable in a carefully selected region of the globe to do focused tests of coupled DA on 
a regional scale. As a general guiding principle for planning observing missions, DA 
typically benefits more from larger quantities of well-distributed moderate-accuracy data 
than from a small quantity of high-accuracy data. (In this case, ‘high-accuracy’ is 
defined as greater than the accuracy resolved by the model.) 

• Increase the observing effort for areas of the earth system that are under-observed and 
underconstrained. As we shift from forced single-domain models to coupled earth system 
models, the predominant impacts from biases in one domain can shift to another. In order 
to constrain these biases there must be a concerted effort to constrain long ignored 
regions, such as the deep ocean, sea ice, and the ocean under sea ice. 

• Establish a mechanism for developing countries to securely contribute local observations 
to major global NWP operational forecasts. Further, establish a visiting scientist plan for 
providing expert training in DA and NWP to forecasters, researchers, and students from 
these countries. 

• Encourage field campaigns that plan for co-located observations. Increase collaboration 
between field campaigns, modelers, and CDA community for conducting process studies 
that cross domain boundaries. 

• The coupled modeling and data assimilation community should identify key opportunities 
for improving earth-system predictability through better formulation of forecast, 
observation, or data assimilation methods, informed by the insights from dedicated 
observational campaigns. Examples of successful past field campaigns include 
studies of coupling in tropical cyclones, the marginal ice zone, and the Madden-Julian 
oscillation (MJO). 


Comprehensive Workshop Review 


Workshop Description 


A workshop was held October 18-21, 2016, regarding the current state and future of Coupled 
Data Assimilation (CDA). The workshop was hosted in Toulouse and sponsored by the WMO, 
Meteo-France, ERA-CLIM2, and the NOAA Climate Program Office (CPO) Modeling, Analysis, 
Predictions and Projections (MAPP) program. The workshop had representation from ECMWF, Met Office, 
NOAA, NASA, NRL, BOM, JMA, JAMSTEC, ECCC, Météo France, and the academic 
community. A brief workshop summary is given by Penny and Hamill (2017), while this 
document provides a more thorough discussion of the proceedings. 


Further information, including archived presentations, can be found at: 
http://www.meteo.fr/cic/meetings/2016/CDAW2016/ 


Role and Goals of the WMO 


The World Meteorological Organization (WMO) World Weather Research Programme (WWRP) 
“ensures the implementation of a research strategy towards the seamless prediction of the Earth 
system from minutes to months,” (Brunet, Jones, and Ruti, 2015) as well as estimating forecast 
impacts for downstream end user needs and decision making. The purpose of this workshop was 
to provide WMO with concrete actions to apply CDA in the context of the WWRP plan. The 
WMO working group on Data Assimilation (DAOS) recognizes CDA as one of the frontiers of 
DA research. From the perspective of the CDA community, the WMO can play a significant role 
in facilitating the coordination of efforts across multiple countries. 


Further questions of interest to the DAOS working group include: 

1) How can new and emerging data sources be considered for DA in the future, especially for 
data-sparse regions like the African continent, where the same needs exist (floods, etc.) but 
with fewer data? 

2) How can this DA group contribute to field campaigns? 

3) Are closer links needed with nowcasting and mesoscale research for high-resolution modeling? 

4) What can the DA community do to benefit all of the WMO members? 

5) DA guidelines for developed and developing countries 


Concepts and Terminology for Coupled Data Assimilation 


The terms ‘domain’, ‘component’, ‘sub-domain’, and ‘sub-component’ are all used somewhat 
interchangeably to refer to one part of the total coupled earth system that may be isolated 
conceptually (e.g. atmosphere, ocean, sea-ice, land, wave, aerosol, etc.). We generally refer to a 
‘domain’ of the earth system, or a ‘component’ of an earth system model. 


Existing methods for CDA lie on a spectrum (the list is not exhaustive): 

• Quasi Weakly Coupled DA (Quasi-WCDA): assimilation is applied independently to 
each of a subset of components of the coupled model. The result may be used to initialize 
a coupled forecast. 

• Weakly Coupled DA (WCDA): assimilation is applied to each of the components of the 
coupled model independently, while interaction between the components is provided by 
the coupled forecast system. 

• Quasi Strongly Coupled DA (Quasi-SCDA): observations are assimilated from a subset 
of components of the coupled system. The observations are permitted to influence other 
components during the analysis phase, but the coupled system is not necessarily treated 
as a single integrated system at all stages of the process. 

• Strongly Coupled DA (SCDA): assimilation is applied to the full earth system state 
simultaneously, treating the coupled system as one single integrated system. 
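
The difference between the weak and strong ends of this spectrum can be illustrated with a toy analysis step. The sketch below is not taken from any operational system: it assumes a hypothetical two-variable state (one atmosphere value, one ocean value), a made-up background error covariance, and a single ocean observation. Under SCDA the cross-domain covariance lets the ocean observation correct the atmosphere; under WCDA the cross-domain blocks are effectively zero during the analysis, so only the ocean is updated.

```python
import numpy as np

def kalman_update(xb, B, y, H, R):
    """Standard Kalman analysis step: xa = xb + K (y - H xb)."""
    K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)
    return xb + K @ (y - H @ xb)

# Hypothetical two-variable state: [atmosphere, ocean] (illustrative values only).
xb = np.array([1.0, 2.0])
B_full = np.array([[1.0, 0.6],
                   [0.6, 1.0]])   # off-diagonal term = cross-domain error covariance

# A single observation of the ocean variable.
H = np.array([[0.0, 1.0]])
R = np.array([[0.5]])
y = np.array([3.0])

# SCDA: the full covariance is used, so the ocean observation also updates the atmosphere.
xa_scda = kalman_update(xb, B_full, y, H, R)

# WCDA: each component is analysed independently, which is equivalent here to
# zeroing the cross-domain blocks of B during the analysis.
B_weak = np.diag(np.diag(B_full))
xa_wcda = kalman_update(xb, B_weak, y, H, R)

print("SCDA analysis:", xa_scda)
print("WCDA analysis:", xa_wcda)
```

In the weakly coupled case the atmosphere would only feel the ocean correction later, through the coupled forecast that provides the next background.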


This ‘bottom up’ approach to defining CDA is born out of the construction of CDA from 
existing DA systems for each separate domain of the coupled earth system. The problem may 
also be considered from a ‘top down’ abstraction - all DA applied to a coupled model begins 
conceptually as SCDA which is simplified using localization, either applied explicitly or applied 
implicitly by the decision to update only part of the total domain. As a frame of reference, NCEP 
currently initializes the Climate Forecast System (CFSv2) for seasonal prediction using WCDA. 
For the CFSv2, each component is initialized with its own independent DA system: the 
atmosphere uses 3DVar (GSI), the ocean uses 3DVar (GODAS), and sea ice uses nudging. 


A Summary of Coupled DA Efforts at Operational Centers 


ECMWF 


ECMWF is currently testing a quasi-SCDA system called CERA, which is based on a variational 
method with a common 24-hour assimilation window shared by the atmospheric and ocean 
components. The coupled model is introduced at the outer-loop level by coupling ECMWF’s 
Integrated Forecasting System (IFS) for the atmosphere, land and waves to the NEMO model for 
the ocean and to the LIM2 model for sea ice (Laloyaux et al., 2016a). This means that air-sea 
interactions are taken into account when observation misfits are computed and when the 
increments are applied to the initial condition. In this context, ocean observations can have a 
direct impact on the atmospheric analysis and, conversely, atmospheric observations can have an 
immediate impact on the analysed state of the ocean. The increased complexity is intended to 
improve medium-range forecasts by extending the prediction horizon (Mulholland et al., 2015), 
making better use of observations (Laloyaux et al., 2016b), and providing new applications. The 
system has implemented a relaxation scheme to constrain the SSTs. 

ECMWF has completed the production of a new global 20th-century reanalysis which 
aims to reconstruct the past weather and climate of the Earth system including the atmosphere, 
ocean, land, waves and sea ice. This coupled climate reanalysis, based on the CERA system and 
called CERA-20C, assimilates only surface pressure and marine wind observations as well as 
ocean temperature and salinity profiles. The air-sea interface is relaxed towards the sea-surface 
temperature from the HadISST2 monthly product to avoid model drift while enabling the 
simulation of coupled processes. No data assimilation is performed in the land, wave and sea-ice 
components, but the use of the coupled model ensures a dynamically consistent Earth system 
estimate at any time. 
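
As a rough illustration of what relaxing a model field towards a reference product means, the sketch below applies Newtonian relaxation to a drifting SST. It is a minimal sketch under stated assumptions, not the CERA scheme: the function name, the time step, the relaxation timescale `tau`, and the temperature values are all hypothetical, and a real model would add its own dynamical tendency at every step.

```python
import numpy as np

def relax_toward_reference(sst, sst_ref, dt, tau):
    """One explicit step of Newtonian relaxation: dSST/dt = -(SST - SST_ref) / tau."""
    return sst - (dt / tau) * (sst - sst_ref)

# Hypothetical values (kelvin): a model SST that has drifted from the reference product.
sst = np.array([290.0, 300.0])
sst_ref = np.array([288.0, 301.0])

# Ten steps with dt = 1 and tau = 5 (same arbitrary time unit); the dynamical
# tendency of the model is omitted so that only the relaxation term acts.
for _ in range(10):
    sst = relax_toward_reference(sst, sst_ref, dt=1.0, tau=5.0)

print("Remaining departure from reference:", sst - sst_ref)
```

A shorter `tau` ties the surface more tightly to the reference product, while a longer `tau` leaves more room for the coupled model to develop its own variability.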

ECMWF’s Roadmap to 2025, which summarises the Centre’s new ten-year Strategy, 
highlights that, “As forecasts progress towards coupled modelling, interactions between the 
different components need to be fully taken into account, not only during the forecast but also for 
the definition of the initial conditions of the forecasts.” In this context, ECMWF is now 
producing CERA-SAT, which is based on the CERA system at higher resolution with the full 
observing system (satellite, upper air, land, wave, sea ice). A proof-of-concept will be delivered 
over a recent period. 


NOAA/NCEP 


The previous NCEP CFSv2 (Saha et al., 2014; Saha et al., 2010) used the 3DVar-GSI analysis 
scheme for the atmosphere, assimilating satellite radiances with variational bias correction. 
For the ocean it used the 3DVar-GODAS approach, assimilating only temperature in a 1/2° model 
with 1/4° refinement at the equator. For land it used the Noah land model with NASA LIS DA 
forced by the CFSv2, and for sea ice it applied nudging toward an offline sea ice concentration 
analysis. 

The new paradigm at NCEP is to have the prediction at all scales (weather, subseasonal, 
seasonal) be ensemble-based. As part of the strategy for the Next-Generation Global Prediction 
System (NGGPS), NCEP is developing a unified modeling framework that will use a NOAA 
Environmental Modeling System (NEMS) coupler to allow different earth system component 
models to exchange fluxes. The interface software is NUOPC based. The NGGPS system 
consists of atmosphere (FV3 dycore + GFS physics), ocean (MOM6/HYCOM), waves 
(WAVEWATCH III), land (NOAH), and sea ice (CICE5/SIS2/KISS) component models. The design 
is flexible to allow for different components to be coupled together depending upon the 
prediction scales. 

NCEP is currently upgrading its seasonal forecasting system to serve as one of the 
prototypes for its coupled modeling framework. The coupled system will include atmosphere, 
ocean, waves, aerosol, land and sea ice components. The current plans for the seasonal forecast 
system leverage independent DA developments for each of the components, using Hybrid-EnVar 
for the atmosphere, and Hybrid-EnKF for many of the other components. The current 
expectation is to cycle at 6-hour intervals. An effort to standardize the DA software has been 
initiated by the Joint Center for Satellite Data Assimilation (JCSDA) under the title Joint Effort 
for Data Assimilation Integration (JEDI). 

NCEP is developing a community-based Unified Modeling framework to provide 
forecast guidance from weather to seasonal timescales by a target of 2022. Weather timescales 
are roughly described as 10-day forecasts, 10 km resolution, 3-year reanalyses, updated yearly. 
Sub-seasonal applications are planned for forecasts to 6 weeks, with 30 km resolution, 20-30 
members per day, 20+ years of reanalysis and reforecasts (e.g. 1999-present), upgraded every 2 
years. Seasonal applications target 1-year forecast lengths (extended from the current 9 months), 
50 km resolution, a 40-member lagged ensemble (1979-present), and an upgrade frequency of every 
4 years. 


JMA (JMA/MRI) 


A prototype WCDA system was built in March 2016 at JMA/MRI. JMA/MRI had previous 
experience developing a quasi-WCDA system (only ocean observations assimilated into a 
coupled model). The new prototype system at JMA/MRI is designed to replace the ocean-only 
observation assimilation approach. A unique feature of the system is its 10-day ocean DA cycle, 
which is much longer than the 6-hour atmospheric DA cycle. The coupled system uses an atmosphere 
component at T159L60, and an ocean component at 0.5°x1° (latitude x longitude). The 
atmosphere component is updated every 6 hours by 4D-VAR with a TL159L100 uncoupled 
inner-loop model, while the ocean component runs on a 10-day cycle using 3DVAR with IAU. 
This long DA cycle for the ocean component comes from the experience that reconstruction of 
the SST-precipitation negative feedback with a monthly DA cycle improves the atmospheric 
fields in the quasi-WCDA system (Fujii et al. 2009; 2011). Coupled reanalysis experiments have 
been performed for the period from Nov. 2013 to Dec. 2015. Numerical Weather Prediction 
experiments are also currently being conducted. 
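
For readers unfamiliar with IAU (incremental analysis update), the idea is that the analysis increment is not added to the state all at once but is spread over the model integration as a small constant forcing. The sketch below is a generic illustration under stated assumptions, not the JMA/MRI implementation: the function names, the stand-in `persistence` model, and the numerical values are hypothetical.

```python
import numpy as np

def integrate_with_iau(x0, increment, model_step, n_steps):
    """Integrate the model for n_steps, adding the increment in n_steps equal parts."""
    x = np.asarray(x0, dtype=float)
    forcing = np.asarray(increment, dtype=float) / n_steps
    for _ in range(n_steps):
        x = model_step(x) + forcing   # model tendency plus a fraction of the increment
    return x

def persistence(x):
    """Hypothetical stand-in for one model time step (state left unchanged)."""
    return x

x0 = np.array([10.0, -2.0])        # background state at the start of the window
increment = np.array([1.2, 0.4])   # analysis increment, e.g. from a 3DVAR solve

x_iau = integrate_with_iau(x0, increment, persistence, n_steps=8)
print("State after IAU window:", x_iau)
```

With the persistence stand-in, the state at the end of the window equals the background plus the full increment, but it gets there gradually, which is what limits the initialization shock relative to inserting the increment in a single step.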


JAMSTEC 


By 2007, JAMSTEC had developed an SCDA system using 4DVAR with a fully coupled 
atmosphere-ocean adjoint model (Sugiura et al., 2008). The coupled system uses an atmospheric 
component at T42L24 and an ocean component at 1°. The system is currently used for 
experimental seasonal and decadal predictions (Masuda et al., 2015; Mochizuki et al., 2016). 

JAMSTEC is also exploring initialization of WCDA using LETKF. Currently, a coupled 
model with an atmosphere component at T42L20 and an ocean component at 1.4° is used for the 
experiments. Its expected applications are climate reanalysis, seamless climate prediction, and 
understanding mechanisms of natural variability on various timescales. 


BOM 


BOM is focusing on S2S prediction. Substantial improvements have been made by modifying 
DA and ensemble generation. A coupled model breeding method is now used to produce an 
ensemble of perturbed atmosphere and ocean states. This coupled ensemble breeding scheme 
(coupled bred perturbations) led to improved forecast reliability and skill. Atmospheric 
perturbations produced a more reliable ensemble for sub-seasonal prediction than ???. A 
preliminary version of the BOM EnKF/EnOI WCDA and ensemble generation is intended for 
ACCESS-S2. The ocean will use T/S profiles (with possible use of altimeter data in S3). The 
atmosphere will be nudged to a pre-existing analysis (direct assimilation will be used in S3). 
The EnOI ensemble uses a standard 92 members (23x4) for each month, augmented to 184. 
Observation error is estimated using an adaptive moderation of observations (Sakov et al. 
2012). An ‘r-factor’ is used to inflate the observation error for the update of the ensemble 
anomaly. A ‘k-factor’ is a metric that ensures the updates remain within k times the ensemble 
standard deviation. The r-factor is essentially used to hand-tune the relative importance of 
observations versus the model forecast. BOM currently starts by setting the observation error 
to the instrument error and hand-tunes the ‘r-factor’ from there. In summary, BOM intends to 
support an S2S system: the implementation of an EnKF using a WCDA and ensemble generation 
with ACCESS-S2 and S3. 
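
The roles of the two tuning factors can be illustrated with a scalar ensemble update. The sketch below is a simplified reading of the description above, not BOM's implementation or the exact formulation of Sakov et al. (2012): the function `moderated_update`, the way the k-factor cap is imposed (by inflating the observation error variance just enough to limit the mean increment), and all numerical values are assumptions made for illustration.

```python
import numpy as np

def moderated_update(ens, y, obs_var, r_factor=2.0, k_factor=2.0):
    """Scalar ensemble update of a directly observed variable (illustrative only)."""
    mean = ens.mean()
    anom = ens - mean
    var_f = anom.var(ddof=1)
    spread = np.sqrt(var_f)
    innov = y - mean

    # k-factor (assumed form): if the mean increment would exceed k_factor * spread,
    # inflate the observation error variance just enough to cap it at that size.
    var_o = obs_var
    if abs(innov) * var_f / (var_f + var_o) > k_factor * spread:
        var_o = spread * abs(innov) / k_factor - var_f

    # The mean uses the (possibly moderated) observation error variance.
    gain_mean = var_f / (var_f + var_o)
    # r-factor: the anomalies see an inflated observation error, so the ensemble
    # spread is reduced less than a standard update would reduce it.
    gain_anom = var_f / (var_f + r_factor * var_o)

    new_mean = mean + gain_mean * innov
    new_anom = anom * np.sqrt(1.0 - gain_anom)   # square-root style anomaly update
    return new_mean + new_anom

ens = np.array([0.0, 1.0, 2.0, 3.0, 4.0])

# An outlying observation: the mean increment is capped at k_factor * spread.
capped = moderated_update(ens, y=100.0, obs_var=0.1)

# A consistent observation: no cap, but the spread shrinks less because r_factor > 1.
regular = moderated_update(ens, y=2.5, obs_var=2.5)
print(capped.mean(), regular.mean(), regular.var(ddof=1))
```

Setting `r_factor=1.0` recovers the usual variance reduction, which is one way to see the r-factor as a hand-tuned weight between the observations and the model forecast.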


UK Met Office 


The UK Met Office has been focusing up to now on weakly coupled DA and its demonstration in 
an operational environment. Initial work to develop a global WCDA system was based on a 
coupled model comprising a 1/4 degree ocean/sea-ice model and a 60 km resolution atmosphere/land 
model, each with its own DA system. Early results from this WCDA system are reported in Lea 
et al. (2015), where a 13-month run of the WCDA system was compared to equivalent uncoupled 
atmosphere/land and ocean/sea-ice DA systems. Results were generally positive, with similar 
innovation statistics from the WCDA as from the equivalent uncoupled runs. The average SST 
increments in the WCDA system were smaller than in the uncoupled ocean system, indicating a 
better ocean/atmosphere balance. However, some aspects of the coupled model used in the WCDA 
system needed improving, particularly in the diurnal cycle of SST which was over-amplified in 
the coupled model, in the river-runoff in certain regions, and in the assimilation of surface 
temperature data over lakes. Large differences were also seen over the Arctic. These aspects 
highlight the usefulness of WCDA for providing information about biases in the coupled model 
(as opposed to the individual component models). 

Since the work of Lea et al. (2015), a technical implementation of an operational version 
of the WCDA system was developed while addressing some of the coupled model issues 
mentioned above. Particular attention was paid to dealing with ocean observations, which 
typically take longer to arrive than atmospheric observations, to make sure a high proportion of 
them can be assimilated within the WCDA system when running in near real-time. The system 
was implemented in 2016 in the operational suite at the Met Office to demonstrate the feasibility 
of coupled NWP (not yet providing products to customers), and the atmospheric resolution 
was increased to 40 km. 

Upcoming work includes the increase in atmospheric resolution of the above WCDA 
system to that of the operational global NWP forecasting system (about 10km) and an 
assessment of whether the WCDA provides benefit over the existing uncoupled NWP system. If 
it is demonstrated that the WCDA system does provide improved forecast skill, it is planned to 
become the main operational global NWP system at the Met Office, and work to implement a 
coupled ensemble prediction system will follow. 


NASA/GMAO 


NASA/GMAO is developing WCDA, with the background fields coming from an AOGCM. The 
choice of WCDA was due to latency of the ocean observations, slower timescales of the deep 
ocean, and direct assimilation of satellite radiances which are mostly influenced by the ocean 
surface. The latter point emphasizes that the surface fields must be correctly estimated in order to 
use radiance observations effectively. 

The NASA plan is for coupling that has transitioned from none to semi-coupled and, in future, 
will move to weak coupling. The GEOS AGCM has transitioned from using a prescribed ocean surface 
to a prognostic SST, and in future will move to a prognostic SST above an OGCM. The atmospheric 
analysis has transitioned from no analysis of SST to analysis of SST. Future development will 
also include the OGCM, such that it will resolve a diurnal cycle (including runoff), and the ocean 
analysis will transition from a prescribed atmosphere excluding diurnal observations to an analyzed 
atmosphere that includes analysis of diurnal observations. 

Analysis of diurnally varying SST is important, because SST diurnal warming can produce 
a bias of 0-4 K. The background SST variability is dependent on a number of factors. 
Variability due to diurnal warming is high with low winds and low with high winds. Winds at 7 
m/s lead to strong mixing, while 2.5 m/s degrades the relationship, and calm winds can create a 
sharp gradient near the surface. 

OSTIA SST is a foundation SST that excludes the diurnal signal. Until recently the skin 
temperature (Ts) was based on the SST_fnd from OSTIA, and net heat flux at the surface was a 
diagnostic field. In the latest GMAO operational system, a prognostic model for Ts, an analysis 
for Ts, and a Ts analysis increment are all included, hence “coupling” Ts assimilation to 
upper-air atmosphere (u,v,t,q) assimilation. Just as for SST retrievals, all in situ surface and 
satellite radiance observations contribute to the analysis of skin temperature. 

The Ts model calculates a cool skin layer, with diurnal warming below that layer. The 
foundation depth is assumed given and cool skin depth is calculated empirically. The Ts analysis 
is conducted by first finding a temperature profile that fits vertical variation of observed 
temperature. Brightness temperature (Tb) and the Jacobian dTb/dTz are computed using the 
CRTM. 

Results show consistent impacts of resolving the diurnal cycle throughout all 
experiments. There is a lag for impact of insolation on the SSTs. The impact is cooler SSTs on 
daily average, but with warmer peaks in the mid-day. The Ts analysis is tightly coupled to the 
atmosphere, and gives neutral to positive improvement in forecast skill close to the surface 
(Akella et al., 2016). 

An important note of warning is that the diurnal cycle is not restricted to only the top few 
meters. In fact, PIRATA measurements in the Equatorial Atlantic show that the diurnal 
cycle can penetrate as deep as 20-30 m. 

A future plan is to move to weakly coupled DA, and include analysis for SST and sea-ice on top 
of an analyzed ocean. A longer term target is the continental shelves, ice shelves, ice sheets, etc. 
For future reanalyses, the GMAO is moving towards an Earth system approach, ultimately with 
coupled DA of atmosphere, ocean, land, ice, aerosols and chemistry; the MERRA-2 reanalysis 
includes a coupled analysis (weak coupling) of aerosols and atmospheric state (Gelaro et al., 
2017). 


US Navy/NRL 

A global coupled model initial operating capability is planned for 2018, using an ocean at 1/25° 
for deterministic short-term forecasts, and 1/12° for probabilistic long-term forecasts. The initial 
coupled DA implementation will take a weak approach with Hybrid-4DVar used for the 
atmosphere, LIS used for the LSM, 3DVar used for aerosols, no DA used for waves (due to the 
strongly forced nature of the system, though 2DVar is in research), 3DVar used for the ocean, 
and nudging to an ice concentration analysis used for the sea ice. The planned system will use a 
mixed-length approach to specifying atmospheric and oceanic DA windows. The atmospheric 
DA window will remain at 6 hours in the near future and the oceanic window will remain at 24 
hours. Initial experiments with shortening the oceanic window to 6 hours degraded the ocean 
forecast skill. 

The US Navy plans to implement strongly coupled capabilities in the global system 
following the interface solver approach (Frolov et al., 2016). In this approach, each component 
retains its own DA system but incorporates information about cross-fluid covariances 
using ensemble information. This is useful in cases where well-established DA systems exist 
for each domain, and cross-domain correlations exist, but combining the systems is a technical 
challenge. 

In addition to the global S2S system, several regional CDA systems are currently in 
development. A coupled ocean/atmosphere 4DVar is being developed. The goal is to move away from 
separate analyses that can eventually lead to unbalanced coupled system states. Each domain 
keeps its 4DVar system, and the approach exploits the fact that the effective covariance is 
determined by the action of the adjoint prescribed by the TLM. Information from the atmosphere 
is thus projected into the ocean through the TLM, and vice versa. This generates a localized, 
dynamically driven cross-covariance estimate. In addition to the coupled regional 4DVar 
system, NRL is investigating the merits of a pure ensemble system. 

In the future, the regional coupled 4DVar project might expand to the global 
ocean/wave scale. This means that HYCOM will be used as the forecast model, and corrections 
will be computed with the NCOM 4DVar. The TLM and adjoint will be based on a hybrid 
background, but propagated with NCOM. 4DVar systems will be developed for WAVEWATCH III, 
including a TLM and adjoint, and will use ESMF coupling interfaces. 

An ensemble-estimated TLM, called the Local Ensemble Tangent Linear Model (LETLM), is also 
being developed by NRL, as described by Frolov and Bishop (2016), Bishop et al. (2016) 
and Allen et al. (2017). This may allow for a straightforward construction of a TLM for a 
complex coupled model environment. The adjoint of the TLM can be used for forecast 
sensitivity studies, allowing the simulations to be run backwards in time, and can be used by 
more sophisticated data assimilation techniques such as 4DVar. The maintenance for an 
ensemble-based TLM is much simpler than for traditional TLMs, thus allowing quick updates of the 
DA system after model upgrades. 
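
The basic idea of estimating a TLM from an ensemble can be sketched as a local regression: perturbations at a later time are regressed on perturbations at an earlier time within a small stencil around each grid point. The example below is a toy illustration under stated assumptions, not the NRL LETLM code: the function `local_ensemble_tlm`, the one-dimensional grid, the stencil half-width, and the banded "true" dynamics `M_true` are all hypothetical.

```python
import numpy as np

def local_ensemble_tlm(X0, X1, half_width=1):
    """Estimate a banded linear propagator M such that X1 ~= M @ X0.

    X0, X1: (n_state, n_ens) ensemble perturbations at two times. Each row of M
    is fitted by least squares using only the neighbours inside the stencil.
    """
    n = X0.shape[0]
    M = np.zeros((n, n))
    for i in range(n):
        lo, hi = max(0, i - half_width), min(n, i + half_width + 1)
        coeffs, *_ = np.linalg.lstsq(X0[lo:hi].T, X1[i], rcond=None)
        M[i, lo:hi] = coeffs
    return M

rng = np.random.default_rng(0)
n_state, n_ens = 8, 6   # fewer members than state variables

# Hypothetical "true" linear dynamics with purely local (tridiagonal) coupling.
M_true = 0.9 * np.eye(n_state) + 0.2 * np.eye(n_state, k=1) - 0.1 * np.eye(n_state, k=-1)

X0 = rng.standard_normal((n_state, n_ens))
X0 -= X0.mean(axis=1, keepdims=True)   # perturbations about the ensemble mean
X1 = M_true @ X0                       # perturbations after one step

M_est = local_ensemble_tlm(X0, X1)
M_adj = M_est.T                        # the adjoint is simply the transpose
print("Max error in estimated TLM:", np.abs(M_est - M_true).max())
```

Because each row is fitted only within its stencil, the ensemble needs more members than stencil points rather than more members than state variables, which is what makes the approach attractive for large coupled models. Real dynamics are only approximately local and linear, so the fit would not be exact as it is here.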


NCAR 


The Data Assimilation Research Testbed (DART) is a software implementation of various 
ensemble methods. DART comes with the EAKF, EnKF, RHKF, quadratic ensemble filter, and a 
localized particle filter implementation. Unlike the LETKF, which uses a gridpoint-based 
analysis, DART uses an observation-centric analysis; observations are processed serially, but the 
state-vector update due to each observation is computed in parallel. As with any assimilation 
method, one challenging area of implementation is in the prescription of how an observation 
will influence the state. In DART (as in most ensemble methods) this is done via the 
prescription of a localization scheme that limits the influence of an observation on physically 
distant state variables. 
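
The observation-centric, serial style of update can be sketched with a generic ensemble adjustment filter. The code below is a simplified illustration and does not use DART's interfaces: the function `eakf_serial`, the restriction to observations that directly measure one state element, and the Gaussian taper used in place of a compactly supported localization function are all assumptions.

```python
import numpy as np

def eakf_serial(ens, obs, obs_idx, obs_var, coords, loc_scale):
    """Serially assimilate scalar observations with an ensemble adjustment update.

    ens: (n_state, n_ens) prior ensemble. Each observation directly measures the
    state element obs_idx[k] (a simplification of a general forward operator).
    """
    ens = np.array(ens, dtype=float)   # work on a copy
    n_ens = ens.shape[1]
    for y, j, r in zip(obs, obs_idx, obs_var):
        yp = ens[j].copy()                       # prior ensemble in observation space
        y_mean, y_var = yp.mean(), yp.var(ddof=1)
        # Scalar update in observation space (product of two Gaussians).
        post_var = 1.0 / (1.0 / y_var + 1.0 / r)
        post_mean = post_var * (y_mean / y_var + y / r)
        dy = np.sqrt(post_var / y_var) * (yp - y_mean) + post_mean - yp
        # Regress the observation increments onto every state variable at once,
        # tapering the regression with distance from the observation.
        anom = ens - ens.mean(axis=1, keepdims=True)
        cov_xy = anom @ (yp - y_mean) / (n_ens - 1)
        loc = np.exp(-0.5 * ((coords - coords[j]) / loc_scale) ** 2)
        ens += np.outer(loc * cov_xy / y_var, dy)
    return ens

rng = np.random.default_rng(0)
coords = np.arange(10.0)                         # hypothetical 1D grid positions
prior = rng.standard_normal((10, 20)) + 5.0
posterior = eakf_serial(prior, obs=[6.0], obs_idx=[0], obs_var=[0.5],
                        coords=coords, loc_scale=1.0)
print("Observed element, prior vs posterior mean:", prior[0].mean(), posterior[0].mean())
```

The loop over observations is serial, while the regression onto the state vector is a single vectorized operation, mirroring the division of work described above. In a coupled setting, the localization factor `loc` is where decisions about cross-component influence would enter.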

Because DART is a modular system, new observation types and forward operators can 
be added with relative ease via a fixed set of required routines. A single module can implement 
observation operator functions that can make use of model state variables interpolated to any 
location. There are also several adaptive inflation algorithms available. 

DART has been implemented with the POP (ocean), CAM (atmosphere), CICE (ice) and 
CLM (land) single component systems of the CESM model. In terms of coupled global data 
assimilation, NCAR has been working primarily with a weakly coupled system based on 
combining the CAMDART and POPDART systems. In this configuration, 30-member DART 
EAKF updates are implemented separately for the ocean and atmosphere systems -- assimilating 
in situ temperature and salinity profiles (XBTs, MBTs, CTDs, drifters) into the ocean state at 24 
hour intervals and radiosonde temperature, wind and aircraft data at 6 hour intervals into the 
atmosphere. Initial tests of this configuration have been examined over 1970-1981, a time when 
observational coverage even in the atmosphere is mainly over the northern hemisphere. 
Evidenced by comparison to multiple data sources, results show that the system is able to 
constrain both atmosphere and upper ocean variables to reflect interannual variability and 
large-scale synoptic variability in the northern hemisphere. 

Using the results of the weakly coupled system, work has been done examining the 
ensemble correlations (i.e. scaled error covariances) in the coupled system with the intent of 
better understanding how the oceanic and atmospheric data constraints may be useful in a 
strongly coupled DA framework. In general, significant ensemble error covariances between the 
atmosphere and ocean are restricted to within the planetary boundary layer and mixed layer of 
the ocean and are highest in the tropics. In the atmosphere, the summertime hemisphere tends to 
have higher correlations across the boundary than the wintertime hemisphere. Correlations are 
found to be spatially and temporally inhomogeneous, such that building an effective 
cross-component localization scheme is a challenging research topic. While there are general 
heuristics for reasonable localization radii in the ocean and atmosphere, there is no general 
guidance for how to localize across component boundaries. 
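
A cross-component ensemble correlation of the kind discussed here can be computed directly from ensemble anomalies. The sketch below uses a synthetic ensemble rather than CESM/DART output: the variable layout (one near-surface and one remote variable per component), the shared signal, and the noise levels are all assumptions chosen to mimic strong correlation near the interface and weak correlation elsewhere.

```python
import numpy as np

def cross_correlation(a, b):
    """Ensemble correlation between each row of a and each row of b.

    a: (n_a, n_ens) and b: (n_b, n_ens) ensembles of two model components.
    """
    a = a - a.mean(axis=1, keepdims=True)
    b = b - b.mean(axis=1, keepdims=True)
    cov = a @ b.T / (a.shape[1] - 1)
    return cov / np.outer(a.std(axis=1, ddof=1), b.std(axis=1, ddof=1))

rng = np.random.default_rng(1)
n_ens = 200

# Synthetic ensemble: a shared signal links the atmospheric boundary layer and the
# ocean mixed layer; the remote variables are independent of everything else.
shared = rng.standard_normal(n_ens)
atm = np.vstack([shared + 0.3 * rng.standard_normal(n_ens),   # boundary-layer temperature
                 rng.standard_normal(n_ens)])                 # upper-level temperature
ocn = np.vstack([shared + 0.3 * rng.standard_normal(n_ens),   # mixed-layer temperature
                 rng.standard_normal(n_ens)])                 # deep-ocean temperature

C = cross_correlation(atm, ocn)
print(np.round(C, 2))
```

With a realistic ensemble size of a few tens of members, the entries that should be near zero would be much noisier than in this 200-member example, which is one reason cross-component localization matters.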

Because DART has always treated model state vector data as a 1D array of individual 
points (only the model-dependent parts need to know about the connectivity) the development of 
strongly-coupled versions of the DART system was achieved through a relatively 
straightforward concatenation of the ocean and atmosphere state vectors. At this point a 
strongly-coupled ocean/atmosphere system has been implemented and prototyped in a 6 month 
experiment. There were some initial implementation issues that required attention. For example, 
both localization and inflation schemes in DART have historically had component-specific 
parameter settings. Extension to multiple components thus requires either that multiple 
components share one set of parameter settings or that the schemes be generalized. New 
definitions of observation types (e.g. temperature) to specify to which component they belong 
and rethinking how to generalize “vertical distance” given the different coordinate systems in the 
ocean and atmosphere also required attention. Work to make these types of issues more seamless 
within DART is ongoing. 


ECCC 


Environment and Climate Change Canada (ECCC) currently performs near real-time global 
coupled atmosphere-ocean-ice forecasts in “experimental mode” using the NEMO ocean model, 
CICE ice model and the GEM atmospheric model. The analyses for the ocean are obtained using 
a modified version of the Mercator data assimilation system (SAM2) with a daily assimilation 
cycle (assimilating only SST) combined with a weekly cycle (assimilating SST, altimeter data, 
Argo profiles and other in situ observations). This coupled forecast system will likely become 
operational, replacing the uncoupled atmospheric model forecasts, in the near future. Even 
without coupled data assimilation, the initial conditions for each component are reasonably 
consistent with the others since they all use the same SST and sea ice concentration analyses. 

Work has recently begun to implement a WCDA approach for these global deterministic 
models in which the coupled background forecast is used for both the 4D-EnVar atmospheric 
analysis and the daily SAM2 ocean analysis. The SST and sea ice analysis systems are currently 
stand-alone systems that each use persistence of the previous analysis as the background state to 
perform either an Optimal Interpolation (for SST) or 3D-Var (for sea ice concentration) analysis. 
Work will soon begin on migrating these analysis systems into the software framework of the 
atmospheric 4D-EnVar. This is a preliminary step to facilitate future research on the estimation 
and impact of using cross-domain background error covariances in a more strongly coupled data 
assimilation context. 


Review of Group Breakout Sessions 


A series of breakout sessions was held to answer questions posed to the workshop participants. 
The breakout groups reported back during a daily plenary discussion at the close of each day. 
Some topics have general applicability to data assimilation, but all responses are focused on the 
implications specifically for CDA. 


Methods for CDA 


There is a need to build a body of evidence to demonstrate the benefits and drawbacks of various 
CDA methodologies. The field is still growing, in part due to the fact that experiments in CDA 
require a significant computational burden. A larger body of evidence is needed for 
simple-to-state tasks such as: (1) demonstrate that weakly coupled DA is better than uncoupled 
DA, and (2) demonstrate that strongly coupled DA is more effective than weakly coupled DA. 

A white paper by Gary Brassington (circa 2015, Progress in Oceanography) illustrates the 
benefits of various forms of coupling and was suggested as a starting point (per Santha Akella). 
Sluka et al. (2016) indicate benefits of strong coupling versus weak coupling using an 
intermediate complexity coupled atmosphere/ocean model (i.e. SPEEDY/NEMO). 

The perceived benefits presume quality estimates of co-variability, which are available in 
perfect model scenarios but unclear in real-world applications. The question remains whether the 
coupled models are reasonably estimating these quantities. If not, it must be determined why this 
is so, and how the models can be improved to accurately reproduce a realistic cross-domain 
covariance. 

Many operational centers are not yet coupling weakly in their data assimilation efforts, so 
the most accessible starting point for CDA is to begin with WCDA with existing software (e.g., 
coupled forecasts in the outer loop of a 4DVar). There should be a focus on improving the 
weaker DA components, as these may unnecessarily degrade the quality of the coupled forecasts. 

CDA approaches like the interface solver (Frolov et al. 2016) are attractive as a starting 
point for coupled approaches, particularly for variational DA methods. Simplified approaches 
could be attractive if it is found that 10% of the effort might get 90% of the effect of CDA. An 
interface solver and similar approaches could facilitate gradual steps towards strongly coupled 
DA by operational centers, e.g. starting first with weakly coupled DA, then to one-way strong 
coupling, then to two-way strongly coupled DA. 

As strongly coupled DA research begins, many researchers are using different time 
windows for different state components. It is necessary for the research community to build a 
theoretical basis for the proper way to do this. 

Longer-term vision: 


There is value seen in a unified database for all observations across state components, and 
possibly shared between organizations doing coupled DA research. The Joint Center for Satellite 
Data Assimilation (JCSDA) has initiated an effort to develop such a database in the U.S. A 
modular software infrastructure separating observation processing, forward operators, and solvers 
would facilitate co-experimentation, particularly as the best methods for CDA are not yet 
established. The ECMWF effort to develop the Object Oriented Prediction System (OOPS) is 
one example of such a software infrastructure. A nascent U.S. effort is the Joint Effort for Data 
Assimilation Integration (JEDI) led by the JCSDA. 

Sea ice DA presents challenges in methods, observations, and timescales. Some 
inter-variable consistency is critical for sea ice prediction. For example, a simple SST constraint 
is needed where sea ice is introduced. More sophisticated non-Gaussian methods may provide 
value, e.g. when SST is near the freezing point temperature. 

More collaboration between universities and operational centers is needed. A significant 
challenge is the computational burden of large coupled models. Solutions must be developed to 
(a) give universities access to full scale operational models, potentially on operational/dev 
machines (i.e. O2R), and (b) develop acceptable pathways for new discoveries on simple coupled 
models to be translated to operations (i.e. R2O). Demonstration of new methods and ideas in 
research labs and universities is a necessary precursor for acceptance and adoption in operational 
centers. 


Estimation of Forecast Error Covariances 


This session addressed several aspects of the forecast error covariance in the context of CDA. 
One goal of CDA is to estimate the dynamically changing state of the coupled model, given 
incomplete and noisy observations. The forecast error covariance matrix plays a crucial role in 
encoding information about dynamical relationships, separated spatially and across state 
variables. In the atmosphere and to some extent in the ocean, modeling the forecast error 
covariance matrix is facilitated by knowledge of physical balances and of dominant forecast 
errors (e.g. due to baroclinic instabilities). Modeling the cross-domain forecast error covariance 
in an Earth system model is much more challenging because of the wide range of time scales that 
span the relevant phenomena that produce forecast errors. Time scales of relevant phenomena go 
from minutes (such as convection) to centuries and beyond (such as the deep ocean circulation). 
Small forecast errors associated with convection might be negligible for synoptic weather 
prediction but can produce large biases in SST in longer time integrations. What is considered 
noise and what is considered signal may become conflated, and even small biases on short 
timescales can create large model drifts. Another practical issue in determining the forecast error 
covariance from ensemble-based estimates is to identify error growth rates and error saturation 
levels of each model component with coupling versus without. Uncoupled models are typically 
driven by external forcing, and in an ensemble setting such forcing fields can have a strong 
influence on the nature of the ensemble. 


A detailed description of the topics discussed is given next. 


Outstanding questions and challenges: 


Modeling the forecast error covariance matrix across distinct media. Modeling the 
forecast error covariance at the interface of two or more climate components is 
complicated when there is a large contrast in the time and spatial scales on which their state 
variables evolve. For instance, at the ocean-atmosphere interface, estimating vertical 
error covariances requires a proper handling of localization across domains which may 
have different spatiotemporal scales. Optimal inflation parameters may also be 
inconsistent across domains and care must be taken to ensure they do not create new 
imbalances in the analysis. 

Moving from independent forecast error covariance matrices for each domain to a 
coupled forecast error covariance matrix for SCDA. The prescription of the forecast error 
covariance matrix is application dependent. For instance, some applications aim to 
include underrepresented processes (i.e. processes that can be represented at the model 
resolution but may not be present in the model without the use of DA), while other applications 
explicitly exclude underrepresented processes (and treat them as representativeness 
error). The challenge is to identify or consolidate multiple applications into one single 
matrix. 

Fidelity of the forecast error covariance matrix. It was noted that in the context of earth 
system models the goal may either be “true-to-nature”, attempting to resolve 
underrepresented processes (i.e. processes that are resolvable by the model but may 
not be present without the use of DA), or “true-to-model”, in which only 
model-resolvable processes are allowed in the analysis. 

Knowledge of forecast error dynamics in coupled models is limited. Beyond the typical 
(initial condition) error dynamics known in the atmosphere and the ocean, there is a need 
to increase understanding of errors in coupled models. Errors strongly depend on the 
strength and type of the forcing. Errors are also seasonally dependent and vary depending 
on climate regime changes. It is clear that saturation levels change. 

Relevance and use of the coupled model’s climatological forecast error covariance matrix 
(B). Past work on generating the statistics of the climatological B matrix has focused 
primarily on independent components of the coupled system. Relatively little work has 
been done on constructing cross-domain climatological error covariance for coupled 
Earth systems, identifying what timescales must be resolved, or what localization must be 
applied. 

• Missing authentic cross-interface observations. Estimating “true-to-nature” background 
error covariances from observations is complicated because observations are rarely 
co-located. More generally, the lack of observations of some of the model components 
produces large sampling errors in the computation of a cross-domain error covariance 
matrix. 

• Regime-dependent B matrix. Assuming that it is possible to adequately characterize 
regime-dependent forecast error covariances (e.g., during different phases of ENSO), it is 
still an open question whether CDA can gain an advantage by accounting for these 
climatological modes. 

• Static versus flow-dependent B. Is an inaccurate (never actually realized) but benign 
(low-impact / does not induce spurious features) static error covariance better or worse 
than an inaccurate and malignant (high-impact, induction of spurious features) but 
high-information content flow dependent covariance? 

• Hybrid B in Earth system models. Hybrid methods are designed to balance the 
weaknesses of variational methods and EnKFs to create more reliable and stable filters. 
Given the multiple spatiotemporal scales involved, it is unclear whether a traditionally 
constructed climatological forecast error covariance matrix can be incorporated into a 
hybrid system, as has been applied in atmospheric (Kleist 2012; Kleist et al., 2015) and 
oceanic (Penny et al., 2015) cases. 

• Revision frequency of B. Upgrading B requires a considerable amount of effort. 
Guidance should be given on how often the climatological forecast error covariance 
matrix should be revised. What forces control this decision? Is it too difficult, or is there 
too much user-resistance? How can it be determined if there is enough new information 
to justify the work effort? Can this process be automated or put into the model 
development workflow? Should the climatological forecast error covariance matrix be 
recomputed every time model versions change, and/or when model configurations (e.g. 
resolution) change? 

• Modeling model errors. Will stochastic perturbations or model parameter perturbations 
be able to represent model errors well enough over a wide range of time scales? Is there 
value in using multi-model ensembles to represent model errors? 
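
The localization question raised in the list above can be made concrete with a short Python (NumPy) sketch. It applies Schur-product localization to an ensemble covariance for a joint atmosphere/ocean state, using the Gaspari-Cohn function with a separate length scale inside each domain and a damped cross-domain block. The length scales, the damping factor `beta`, and the function names are illustrative assumptions. Note that mixing length scales across blocks in this way does not guarantee a positive semi-definite localization matrix, which is part of the open problem.

```python
import numpy as np

def gaspari_cohn(dist, c):
    """Gaspari-Cohn compactly supported correlation function (zero beyond 2c)."""
    r = np.abs(np.atleast_1d(np.asarray(dist, dtype=float))) / c
    out = np.zeros_like(r)
    inner = r <= 1.0
    outer = (r > 1.0) & (r < 2.0)
    ri, ro = r[inner], r[outer]
    out[inner] = -0.25 * ri**5 + 0.5 * ri**4 + 0.625 * ri**3 - (5.0 / 3.0) * ri**2 + 1.0
    out[outer] = (ro**5 / 12.0 - 0.5 * ro**4 + 0.625 * ro**3
                  + (5.0 / 3.0) * ro**2 - 5.0 * ro + 4.0 - 2.0 / (3.0 * ro))
    return out

def localized_covariance(ens, pos, domain, c_within, c_cross, beta):
    """Localize the sample covariance of a joint state.

    ens: (n_members, n_state); pos: coordinate of each state element;
    domain: integer domain label per element; c_within: {label: length scale};
    c_cross, beta: length scale and damping for cross-domain pairs (assumed).
    """
    P = np.cov(ens, rowvar=False)                      # raw sample covariance
    d = np.abs(pos[:, None] - pos[None, :])            # pairwise distances
    same = domain[:, None] == domain[None, :]          # True within a domain
    c_i = np.array([c_within[int(k)] for k in domain], dtype=float)
    c = np.where(same, c_i[:, None], c_cross)          # length scale per pair
    loc = gaspari_cohn(d, c)
    loc = np.where(same, loc, beta * loc)              # damp cross-domain blocks
    return P * loc, loc

rng = np.random.default_rng(0)
ens = rng.normal(size=(20, 6))                         # 20 members, 3 atmos + 3 ocean
pos = np.array([0.0, 1.0, 2.0, 0.0, 1.0, 2.0])
domain = np.array([0, 0, 0, 1, 1, 1])                  # 0 = atmosphere, 1 = ocean
P_loc, loc = localized_covariance(ens, pos, domain, {0: 2.0, 1: 1.0},
                                  c_cross=1.0, beta=0.5)
```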


Recommendations 

• There is a need to characterize forecast errors in fully coupled models. This includes 
understanding the error growth rates of each model component with and without coupling, 
as well as their seasonality. 

• Due to the numerous spatiotemporal scales present in the coupled earth system, it may be 
advantageous to explore ideas for merging covariance information from different time 
and space scales. For example, signals representing large versus local spatial scales can 
be separated in the forecast error covariance matrix. One could prescribe a set of error 
covariance matrices (as in a hybrid covariance DA method), or a set of gain matrices (as 
in the Hybrid-Gain method of Penny, 2014), and then estimate the optimal weights to 
combine these at any given time. This may also require separating observations into 
“large-scale” vs. “small-scale” signals, as performed by Tardif et al. (2014, 2015). 

• A more thorough treatment of the model forecast error covariance should be pursued. 
Corrections based on a more sophisticated understanding of model uncertainty (e.g. 
stochastic parameterizations, topography/bathymetry perturbations, surface flux / 
boundary forcing ensembles) should be explored. 
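
As a hedged illustration of combining covariance estimates with a tunable weight, the following Python (NumPy) sketch forms a convex combination of a climatological and an ensemble covariance and selects the weight that best explains a given innovation under a Gaussian likelihood. This is one possible selection criterion assumed for illustration; it is not the Hybrid-Gain algorithm, and the matrices, the zero observation error, and the function names are chosen only to keep the arithmetic transparent.

```python
import numpy as np

def hybrid_covariance(B_clim, P_ens, alpha):
    """Convex combination of climatological and ensemble covariances."""
    return (1.0 - alpha) * B_clim + alpha * P_ens

def innovation_nll(d, S):
    """Negative log-likelihood (up to a constant) of innovation d ~ N(0, S)."""
    _, logdet = np.linalg.slogdet(S)
    return 0.5 * (logdet + d @ np.linalg.solve(S, d))

def pick_weight(d, H, R, B_clim, P_ens, alphas):
    """Grid search for the weight whose implied innovation covariance fits d best."""
    scores = [innovation_nll(d, H @ hybrid_covariance(B_clim, P_ens, a) @ H.T + R)
              for a in alphas]
    return float(alphas[int(np.argmin(scores))])

n = 2
B_clim = 1.0 * np.eye(n)        # small climatological variance (assumed)
P_ens = 9.0 * np.eye(n)         # large ensemble variance (assumed)
H = np.eye(n)
R = np.zeros((n, n))            # zero obs error only to simplify the example
alphas = np.linspace(0.0, 1.0, 11)

w_small = pick_weight(np.array([1.0, 1.0]), H, R, B_clim, P_ens, alphas)
w_large = pick_weight(np.array([3.0, 3.0]), H, R, B_clim, P_ens, alphas)
print(w_small, w_large)
```

In practice a single innovation vector gives a very noisy estimate, so any such weight would need to be accumulated over many cycles or regions.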


Model Error and Model Bias 


Coupled DA provides an opportunity to improve estimates 
of coupling parameters in the models. For example, the flux quantities can be corrected based on 
observations made from the corresponding domains that define the flux. As another example, 
parameters such as (ice) drag coefficients that are typically specified in advance can be made 
adaptive during CDA. This also opens the possibility for uncertainty quantification for parameter 
estimation of coupling parameters. 
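
A minimal sketch of adaptive parameter estimation is given here in Python (NumPy), assuming a scalar toy problem: an unknown drag-like coupling coefficient `c` links a known forcing `w` to an observed response `y = c * w`. The ensemble of parameter values is updated with a perturbed-observation ensemble Kalman step, and the remaining ensemble spread gives a rough uncertainty estimate. The linear model, the noise levels, and all variable names are hypothetical and serve only to illustrate the idea.

```python
import numpy as np

rng = np.random.default_rng(0)
c_true, obs_sd, n_ens = 2.0, 0.1, 50        # assumed truth, obs error, ensemble size
c_ens = rng.normal(1.0, 0.5, n_ens)         # biased prior ensemble for the parameter

for _ in range(20):                         # 20 assimilation cycles
    w = rng.uniform(0.5, 1.5)               # known forcing for this cycle
    y_obs = c_true * w + rng.normal(0.0, obs_sd)

    y_ens = c_ens * w                       # predicted observation per member
    cov_cy = np.cov(c_ens, y_ens)[0, 1]     # parameter/observation covariance
    gain = cov_cy / (np.var(y_ens, ddof=1) + obs_sd**2)

    y_pert = y_obs + rng.normal(0.0, obs_sd, n_ens)   # perturbed observations
    c_ens = c_ens + gain * (y_pert - y_ens)           # update the parameter ensemble

c_est, c_spread = c_ens.mean(), c_ens.std(ddof=1)
print(f"estimate {c_est:.3f}, spread {c_spread:.3f}")
```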


Feedback from the model error terms could assist in quality control (e.g. by not rejecting 
observations when model error is high). 


Relative to the atmosphere, the fully coupled earth system has proportionally more 
regions of poor to non-existent observational coverage. The more poorly observed regions of the 
earth system are most likely to accumulate bias. There is a concern about the transfer of biases 
between components in coupled models. For example, an atmospheric bias in temperature could 
impact water mass formation in the ocean and create a long-term bias in ocean circulation. Identifying such biases 
early, at their source, is an important application of CDA in model evaluation. It should be noted 
that biases in a coupled model need not be constrained to the boundary layers - for example 
cloud cover has impacts on mixed layer depth in the ocean and upper ocean heat content. 

Some are using assimilation of anomalies as a remedy to bias transfer, though it is 
unclear whether this is an appropriate long-term solution. A suggestion from the UK MetOffice 
is to use 3D analysis increments as proxies to bias and model error terms for long forecasts. For 
ensemble-based methods, there is a danger that model bias can have a detrimental effect on 
ensemble spread (e.g. by reducing spread even where uncertainty is high). 

Variational bias correction (varBC) is widely used for radiances; however, the growing interest in 
ensemble methods for CDA may necessitate development of a varBC equivalent for ensemble 
methods, such as that of Fertig et al. (2009). There are a number of applications of ensemble 
methods for bias estimation, model error estimation, and parameter estimation. Ensembles have 
been used to provide prior estimates for leading model error and bias terms using a reliability 
budget estimate (Rodwell et al., 2015). In nonlinear forecasts performed at operational centers 
with ensemble data assimilation, the reliability of these forecasts is a key attribute for evaluating 
ensemble performance. Analyzing the analysis increments from the CDA in such an 
ensemble system, as well as computing reliability budgets (Rodwell et al., 2015) to inform on 
model errors in a coupled system, would be a very useful diagnostic tool for model improvement 
and stochastic parameterization. 

There is a consensus that a need exists for more research in stochastic physics and 
parameterizations, which serve as a physically-based replacement to simple inflation techniques 
often applied in ensemble methods. There is limited research at this time studying the impact of 
stochastic physics on coupled models (for an example, see Vialard et al., 2005). For example, 
how does the ocean respond to stochastic parameterizations in the atmosphere? Little work has 
been done exploring stochastic parameterization in the non-atmospheric components. Studies 
using stochastic perturbations for ocean mixing parameterization in idealized experiments (Mana 
and Zanna, 2014; Grooms et al., 2015; Jansen and Held, 2014; Brankart, 2013; Majda, ...) and 
seasonal forecasts (Andrejczuk et al., 2016) show improvements in ensemble reliability both for 
atmospheric and ocean variables as well as improved ocean variability in the idealized models. 

Experiments with model uncertainty in the land surface model represented using different 
stochastic methods were performed with the ECMWF seasonal forecasting system (Cloke et al., 
2011; Macleod et al., 2016). These experiments show that improvements in ensemble forecast 
reliability of surface temperature can be achieved with better representation of model uncertainty 
in the land surface scheme. This can be extended further to CDA techniques where the 
uncertainty in coupled model components such as land and atmospheric boundary layer or 
convection scheme can be coupled in the cross-covariance matrices. 

Stochastic sea-ice strength parameterization has been shown to influence and improve 
model mean climate variability of sea-ice distribution as well as improve forecast of sea-ice area 
on a seasonal timescale (Juricke et al., 2014a, b) in a coupled atmosphere-ocean-sea-ice model. 
These representations of model uncertainty in sea-ice models can be combined with the 
representations of uncertainties in other model components in a CDA framework to incorporate 
appropriate uncertainty representation for the analysis. 


It is acknowledged that there is uncertainty in the coupling terms. It would be advantageous to 
estimate the order of magnitude of errors in the coupler itself. 


CDA can serve as a valuable component of process study field experiments and observation 
network to characterize model error in coupled models and quickly identify flawed observations. 


Simple Models for Studying CDA 


What kinds of simple models should be used to study and evaluate CDA? The widely-used 
Lorenz models (L63, L96) are good for training purposes, however it is unclear whether 
conclusions drawn from coupling such simple models can be reliably extended to more realistic 
coupled modeling environments. For research purposes, it would be advantageous to have a 
series of 1D, 2D, and simple 3D coupled models of increasing complexity. The specific processes 
that are represented depend on the research question being explored. New methods 
must be tested on a series of simple to more complex models, while providing the ability to 
quantify the impact on each individual component. 

Simplified models may be useful both for forecasting and for process understanding, with 
a range of complexity demonstrated in work done by Lawless et al., Tardif et al. (2014, 2015), 
Bishop et al. (2016), Sluka et al. (2016). Examples of simplified coupled models include the 
Lorenz system 2 (Lorenz 1996), the Pena and Kalnay (2004) coupled modified Lorenz system, a 
multi-layer Lorenz-96-type scheme by Bishop et al. (2016), the Lorenz wave-mean-flow 
atmospheric model coupled to a Stommel-type box ocean model developed by Roebber (1995) 
as used by Tardif et al. (2014, 2015), and low-order models developed by Vannitsem et al. 
(2015). As a step up in complexity, a coupled 2-layer QG atmosphere with a 1-layer shallow 
water equation system was analyzed by Vannitsem and Lucarini (2016). The SPEEDY/NEMO 
model (Kucharski et al., 2015), as used by Sluka et al. (2016) for SCDA is a more sophisticated 
coupled model that begins to approach the level of operational model complexity. 
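
As an example of the simplest end of this hierarchy, the following Python (NumPy) sketch implements the two-scale model of Lorenz (1996), in which slow variables X are coupled to fast variables Y. Reading X as an ocean-like component and Y as an atmosphere-like component is an analogy assumed here for illustration. The parameter values are commonly used defaults, and the small time step and the function names are choices made for this sketch.

```python
import numpy as np

def l96_two_scale_tendency(X, Y, F=10.0, h=1.0, c=10.0, b=10.0):
    """Tendencies of the two-scale Lorenz (1996) model.

    X: (K,) slow variables; Y: (K, J) fast variables, Y[k, j] coupled to X[k].
    """
    K, J = Y.shape
    dX = (-np.roll(X, 1) * (np.roll(X, 2) - np.roll(X, -1))
          - X + F - (h * c / b) * Y.sum(axis=1))
    y = Y.ravel()                      # fast variables form one ring of length K*J
    dy = (-c * b * np.roll(y, -1) * (np.roll(y, -2) - np.roll(y, 1))
          - c * y + (h * c / b) * np.repeat(X, J))
    return dX, dy.reshape(K, J)

def rk4_step(X, Y, dt, **params):
    """One fourth-order Runge-Kutta step for the coupled system."""
    k1x, k1y = l96_two_scale_tendency(X, Y, **params)
    k2x, k2y = l96_two_scale_tendency(X + 0.5 * dt * k1x, Y + 0.5 * dt * k1y, **params)
    k3x, k3y = l96_two_scale_tendency(X + 0.5 * dt * k2x, Y + 0.5 * dt * k2y, **params)
    k4x, k4y = l96_two_scale_tendency(X + dt * k3x, Y + dt * k3y, **params)
    X_new = X + dt / 6.0 * (k1x + 2.0 * k2x + 2.0 * k3x + k4x)
    Y_new = Y + dt / 6.0 * (k1y + 2.0 * k2y + 2.0 * k3y + k4y)
    return X_new, Y_new

rng = np.random.default_rng(0)
K, J = 8, 4
X = 10.0 + 0.01 * rng.normal(size=K)   # slow variables near F, slightly perturbed
Y = 0.1 * rng.normal(size=(K, J))      # small-amplitude fast variables
for _ in range(200):
    X, Y = rk4_step(X, Y, dt=0.001)
```

Setting `h=0` decouples the two scales, which gives a simple way to compare coupled and uncoupled error growth in the same code.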

Some frameworks already exist for applying DA to a variety of simplified models, for 
example: the Data Assimilation Research Testbed (DART), the Object Oriented Prediction 
System (OOPS), the Parallel Data Assimilation Framework (PDAF), EMPIRE. 


What about var schemes? 


The world CDA research community must work on bridging the gap between simplified 
academic research studies and full scale operational applications. This will require two-way 
communication between the interested parties. NOAA’s numerous testbeds 
(http://www.testbeds.noaa.gov/) provide an example framework for bridging the ‘Research to 
Operations’ (R2O) and ‘Operations to Research’ (O2R) gaps. 


The panel recommends a set of benchmark problems and datasets that are suitable for the 
testing and evaluation of a hierarchy of coupled models, e.g. SST at a specific time and place. 


Coupled Initialization and Prediction 


While there is general agreement that model quality is the first-order problem for CDA, a 
number of other challenges have been identified that deserve attention. 

Background (forecast) moments used in a cycled DA system may be more faithfully 
represented in a coupled system, either using weakly or strongly coupled DA. Phenomena that 
would likely benefit from CDA include: precipitation/SST feedback effects, the MJO, the 
AMOC and NAO, Tropical Cyclones, ENSO, near surface atmosphere and land surface 
interactions, polar prediction (including sea ice predictions). 

Including land processes could allow surface moisture and snow observations to impact the 
atmosphere. A better representation of the land surface would also allow for better use of 
near-surface radiance channels from satellite observations. 

It has been noted that there is a degradation of coupled-prediction performance due to a lack 
of balance in the initialized variables (“shock”). One example is the ENSO problem, where an 
imbalance between the wind stress (surface pressure) and stratification (thermocline) will lead to 
systematic increments in the ocean and the generation of spurious vertical velocities. Another 
example is the sea-ice prediction problem: because the sea-ice is in a sensitive equilibrium with 
the ocean and atmosphere, imbalances can produce spurious melting/freezing. Yet another example 
is near-surface, near-shore ocean processes, where a lack of correction to the atmospheric fields 
will lead to a need to continuously re-update the ocean (the information will be lost if the system 
is highly slaved). It is anticipated that these types of problems can be improved with strongly 
coupled DA. 

CDA can provide a means of improving the suboptimal use of observational data of the 
earth system. For example, marine surface data such as SST and surface pressure have a 
relatively long record that extends into the pre-satellite era; coupled assimilation allows these 
data to influence the ocean and to be used for initialization before 1960 (e.g. for decadal 
prediction). Parameters for the 
boundary can be conceptualized as another “model state” and optimized (e.g. emission sources 
within a chemical transport model). It is also anticipated that these problems can be improved with 
strongly coupled DA. 

There are a number of challenges that must be overcome. First, when component models 
are coupled, biases may compound and lead to more rejection of observations. Under some 
circumstances, single component assimilation may be a better choice if high-quality boundary 
forcing data is available (i.e. of better quality than you can achieve with a particular coupled 
model). 

The ‘best’ analysis does not necessarily lead to the most accurate prediction. Different 
methods may be better suited for initialization of forecast models and reanalysis, for example. 
CDA may in principle be designed to control error growth on instabilities related to 
low-frequency variability, i.e. those that give rise to longer-term forecast skill. Such long-range 
modes are genuine results of the coupling. In principle, given an approximate estimate of 
their structure, CDA can help reduce the error along them and thus improve long-term prediction 
skill. 

There are also a number of practical challenges specific to operational centers. The 
background errors may need to be re-characterized when moving from a single component 
background to a coupled background. Operational CDA may require building in new 
data-streams, assimilation systems, and standard operating procedures for the new components to 
support real-time prediction. 


Decadal prediction offers additional challenges. What are the best methods to produce 
historical reanalysis for initialization of decadal prediction (given a long reanalysis is needed for 
drift correction and retrospective prediction verification)? The observational network changes on 
the same time and space scales as the signal in which we are interested. Can we develop ways of 
initializing/assimilating that are focused on the target decadal scale variability? 


Observing System 


There is an overall need for more observations of the non-atmospheric components of the earth 
system. For CDA, obvious needs are surface measurements, for example soil moisture, snow, 
and Sea Surface Salinity (SSS) with improved fidelity. The continuity of missions already measuring 
surface quantities should be ensured, and expansion to all cross-domain interfaces is 
recommended. 

Flux measurements are important and should be increased. This includes not only air/sea 
and land/air fluxes, but also fluxes between earth system components such as sea-ice/ocean, 
land/sea (e.g. river runoff, calving, etc.), and all other inter-component interfaces. 

As one of the more under-observed components, there is a strong demand for more 
observations of sea-ice in the polar regions. That includes, for example, sea-ice thickness, 
surface temperatures, surface snow, and ocean measurements under sea-ice. 

As the non-atmospheric components become integral to real-time NWP by their use in 
CDA, a faster and more organized delivery system will be necessary for observations of the 
non-atmospheric domains. This was noted particularly for ocean observations, but applies to all 
other domains as well. This need also includes notices of data dissemination changes that follow 
NWP operational data standards (e.g. the transition from Jason-2 to Jason-3, a short-notice 
change that was announced with only 13 days' notice). 

Additional attention should be applied to the estimation of observation errors. In 
particular, there is a need for co-located observations (i.e. multiple observations made at the 
same lon/lat coordinate in both the atmosphere and underlying domain). Such co-located obs 
would help validate cross-covariance estimates between domains. 

There is a need for observing systems that can better resolve the diurnal cycle. This is of 
particular interest to CDA because this has a large impact on the boundary layers and uncertainty 
in the state estimate in that boundary. 

Deep ocean observations are needed to help identify and diagnose the sources of 
long-term coupled model drift, particularly in climate and decadal prediction applications. 


Coupled Observation Operators 


The application of DA to coupled systems opens the opportunity to utilize observation operators 
that require inputs from multiple domains. The most obvious of these are satellite measurements, 
which often compute integrated quantities within the satellite’s field of view. Examples include 
radiance measurements impacted by surface emission and aerosols, gravity measurements 
impacted by total mass in the column from the top of the atmosphere to the bottom of the ocean 
or land hydrography, or satellite altimetry impacted by steric sea surface height and wave 
roughness. Many of the retrieved products that use these data depend on climatologies or model 
state estimates derived from inconsistent forecasting systems. CDA presents the possibility for a 
new age in observation operator accuracy by including up-to-date and consistent coupled 
forecast state estimates in every part of the obs operator computation. 

Observation types identified with potential benefit to be addressed with coupled 
observation operators include: sea surface temperature (atmos/ocean), Scatterometer winds, 
altimeter sea surface height, ocean color, gravity e.g. GRACE/GOCE (atmos/ocean, atmos/land, 
atmos/ice), SAR (ice/wave, ocean/wave), sea surface salinity (atmos/ocean), soil moisture 
(atmos/land), land surface temperature (atmos/land), snow cover (atmos/land, atmos/ice). 

Radiative transfer models that represent radiances for surface salinity and 
surface temperature must be improved. Instrument error estimates would ideally be provided for 
each. The uncertainty associated with the observation operator itself is also important 
information that should be provided. 

Coupled observation operators designed for radiance-based and other types of 
satellite-based measurements magnify the need for coupled atmos/aerosol DA. 


Surface Temperature: 

In particular, discussion focused on surface temperature as an ideal starting point for 
addressing coupled observation operators and DA at the interface of many domains. Surface 
temperature experiences diurnal variability. For the ocean, SST products are often averaged daily 
thus hiding this diurnal cycle. CDA applied with models that represent this diurnal cycle requires 
the diurnal variability to be represented in the observations as well. In the ocean, this diurnal 
variability is not restricted to the surface alone. While there is a clear diurnal cycle in the Argo 
profiles down to 4m, it has been observed that the diurnal variability can reach as deep as 20-30m 
in the equatorial Atlantic. The current generation of ALE ocean models (e.g. GFDL’s MOM6) 
can represent thin layers in the upper ocean and may prove useful in assimilating near surface 
temperature measurements. However, current efforts are utilizing additional Near Surface Sea 
Temperature (NSST) profile models to project ocean foundation temperature to an estimated 
ocean skin temperature (effectively acting as the ocean side of a coupled observation operator) 
which can then be input into a radiative transfer model (completing the coupled observation 
operator). 
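
The two-stage structure described above can be sketched in Python as a composition of an ocean-side and an atmosphere-side function. Everything in this sketch is a placeholder assumed for illustration: the additive diurnal-warming and cool-skin offsets stand in for an NSST profile model, and the single-layer gray-atmosphere formula stands in for a real radiative transfer model. The function names, dictionary keys, and values are hypothetical.

```python
def skin_temperature(t_foundation, diurnal_warming, cool_skin):
    """Ocean side: project foundation temperature (K) to skin temperature (K)."""
    return t_foundation + diurnal_warming - cool_skin

def brightness_temperature(t_skin, t_air, emissivity, transmittance):
    """Atmosphere side: toy gray-atmosphere radiative transfer (placeholder)."""
    surface_term = emissivity * t_skin * transmittance
    atmosphere_term = (1.0 - transmittance) * t_air
    return surface_term + atmosphere_term

def coupled_obs_operator(ocean_state, atmos_state):
    """Map a joint ocean/atmosphere state to a simulated brightness temperature."""
    t_skin = skin_temperature(ocean_state["t_foundation"],
                              ocean_state["diurnal_warming"],
                              ocean_state["cool_skin"])
    return brightness_temperature(t_skin,
                                  atmos_state["t_air"],
                                  atmos_state["emissivity"],
                                  atmos_state["transmittance"])

ocean = {"t_foundation": 300.0, "diurnal_warming": 0.5, "cool_skin": 0.2}
atmos = {"t_air": 280.0, "emissivity": 0.98, "transmittance": 0.9}
tb = coupled_obs_operator(ocean, atmos)
print(f"simulated brightness temperature: {tb:.2f} K")
```

The point of the structure is that both inputs come from the same coupled forecast, so the simulated observation is sensitive to the ocean and atmosphere states consistently.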


Do the ocean and diurnal models need to be coupled? 


The land surface models have a diurnal skin layer, which typically has a longer-term 
memory. However, the model representation is not realistic enough compared with what is seen in TIR 
observations. In general, the models generate realistic fluxes from their defined LST, but 
typically have long wave biases. There is a need for modelers to introduce an LST that is more 
representative of a true skin layer in the land model. An alternative would be to introduce a 
diagnostic LST, either physics based or regression based. Models are beginning to be introduced that 
have dynamic vegetation, with separate surface temperatures for the canopy and soil, which will 
improve the realism of modeled LST. There are thermal sensors like MODIS, and microwave 
sensors that measure brightness temperature (more related to soil moisture). However, when 
assimilating MODIS LST, the measurement is not exactly surface or skin temperature. It can go 
through the canopy. 

Sea-ice has melt ponds; they sit on top of sea-ice. The models don’t adequately resolve 
these features, though they may be parameterized (e.g. in CICE). This creates a notable bias, 
particularly in the summer months. 

Should we only assimilate Tb, retrievals, or a combination of the two? There are 
difficulties with correlations in observation errors that have not been adequately resolved in 
general for data assimilation applications. 

For ensemble methods, the localization radius is highly dependent on ensemble size. 
There are also latitudinal dependencies. We must assume autocorrelation will be larger than 
spurious noise. (Francois) 

It is possible to apply a different radius to SST observations versus in situ profiles. It is a 
common approach to manually tune observation errors to adjust data assimilation system 
performance. As system complexity and observing network size increase, this approach will 
likely prove inadequate and should be replaced by automated methods for accurate estimation of 
observation errors. 
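
One candidate for such automated estimation, named here as an assumption since the discussion does not prescribe a method, is the innovation-consistency diagnostic of Desroziers et al. (2005). The Python (NumPy) sketch below checks it in a scalar toy setting with known error variances: the product of analysis and background residuals in observation space averages to the observation error variance, provided the gain used in the analysis is consistent with the true statistics. All values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000                                   # number of independent toy cases
b_var, r_var = 2.0, 0.5                       # true background / obs error variances

truth = np.zeros(n)
xb = truth + rng.normal(0.0, np.sqrt(b_var), n)   # backgrounds
y = truth + rng.normal(0.0, np.sqrt(r_var), n)    # observations (H = 1)

k = b_var / (b_var + r_var)                   # optimal scalar gain
xa = xb + k * (y - xb)                        # analyses

d_ob = y - xb                                 # observation minus background
d_oa = y - xa                                 # observation minus analysis
d_ab = xa - xb                                # analysis increment

r_est = np.mean(d_oa * d_ob)                  # estimates the obs error variance
b_est = np.mean(d_ab * d_ob)                  # estimates the background variance
print(f"R estimate {r_est:.3f}, B estimate {b_est:.3f}")
```

When the assumed gain is not consistent with the true statistics the estimates are biased, so in practice the diagnostic is usually applied iteratively and with care.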


Francois - we had to artificially increase the SST error estimates 
Sea ice temperature retrieval 
What are the next big challenges for surface obs? 

Coupled observation operators must be developed until the direct assimilation produces 
equivalent or better results than using precomputed retrieval products. Representation of the 
diurnal cycle is insufficient. Localization is a challenge. Specification of observation errors 
requires more attention. The assimilation of individual obs versus super-observations must be 
assessed (e.g. for the Navy, super-ob’ing depends on the Rossby radius and is based on water 
type - i.e. T/S/density). Sea-ice temperature is a challenge. 
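
For readers unfamiliar with super-observations, a minimal Python (NumPy) sketch follows. It averages observations that fall in the same fixed-size latitude/longitude cell and reduces the error variance by the number of observations averaged. The fixed cell size is a simplification of the Rossby-radius and water-type criteria mentioned above, the variance reduction assumes uncorrelated observation errors, longitude wrap-around is ignored, and the function name is hypothetical.

```python
import numpy as np

def superob(lon, lat, values, err_var, cell_deg):
    """Average observations per lat/lon cell.

    Returns an array with one row per super-observation:
    [mean lon, mean lat, mean value, reduced error variance].
    """
    ix = np.floor(lon / cell_deg).astype(int)
    iy = np.floor(lat / cell_deg).astype(int)

    cells = {}                                   # cell index -> observation indices
    for i, key in enumerate(zip(ix.tolist(), iy.tolist())):
        cells.setdefault(key, []).append(i)

    rows = []
    for idx in cells.values():
        idx = np.array(idx)
        rows.append((lon[idx].mean(), lat[idx].mean(),
                     values[idx].mean(), err_var / len(idx)))
    return np.array(rows)

lon = np.array([0.1, 0.4, 5.2])
lat = np.array([0.2, 0.3, 5.1])
sst = np.array([10.0, 12.0, 20.0])
sobs = superob(lon, lat, sst, err_var=1.0, cell_deg=1.0)
print(sobs)
```
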


Software and Hardware Issues 


General trends in software design are encouraged for the development of future CDA tools, 
including: modularity of code, managing code complexity, efficient/optimal use of emerging 
hardware. Trends in hardware include accelerators, many-core processors, reduced memory 
bandwidth, NV RAM (non-volatile RAM), “burst” memory, and arrays of SSDs. 

Active management (i.e. minimization) of communication within the DA system, across 
the coupled DA, and with the coupled model is encouraged. This includes a reduction of 
‘transposes’. A common bottleneck in DA applications (especially ensemble-based methods) is 
I/O; implementation strategies that minimize I/O are preferable. 

There is a need to actively increase DA community understanding of how emerging 
technologies (e.g. computer science, statistics, software, visualization, data science, big data, and 
data mining) map onto the objectives of CDA. Greater interaction with these external 
communities should be encouraged. The needs of CDA should be communicated to these 
communities. The DA community should begin reaching out to advertise open problems that 
may have technological solutions. Common data formats and APIs should be established and 
made available to these communities. It may be advantageous to selectively invite members of 
these communities to future CDA meetings. 

Much of the existing DA code may need to be refactored to make use of new hardware. 
Because legacy constraints at many existing centers have led to the adoption of 
weakly-coupled DA techniques, this may actually provide an opportunity to build versatile and 
appropriate CDA code that can perform a spectrum of weakly to strongly coupled DA methods. 
Such a refactoring further provides an opportunity for an international standardization of 
modular software for CDA that promotes intercomparison of results, increases efficiency of 
scientific research, and promotes R2O and O2R. Given these recommendations, it is important to 
identify, as models, existing efforts that have been successful in establishing a common framework 
and staying up-to-date with new software and hardware developments. 


The priorities in such an international collaboration for software standardization are: 

1) Databases/repositories storing observations across all domains with standardized data 
formats. Investigation of commercial big-data supporting applications such as Google 
BigTable, or other open-source noSQL options may be useful. 

2) Development of standardized forward operators for key variables. In particular, for CDA 
this applies to operators dependent on multiple domains. Such standardization is 
especially valuable for complex operators (e.g. Radiative Transfer Models for radiance 
assimilation). 

3) Standardized output formats to support DA diagnostics. This would support both tools 
that could be run integrated within models and DA systems, as well as tools that might be 
run offline. 

4) A common infrastructure to apply the most common DA methods (e.g. either 
gridpoint-based or simultaneous global solutions). Flexibility is encouraged to avoid 
preventing new solution methods from being viable on such a platform. 


Feedback is encouraged to HPC infrastructure designers. 


The CDA community must define specific “use-cases” that are representative of the work 
we expect to be doing in the next 5-10 years. This might include an EnKF CDA use case (limited 
by communication and ensemble simulation), a variational CDA use case (limited by running 
the adjoint and performing a global minimization), and hybrid use cases. Different constraints 
may be applicable for real-time operational forecasts versus offline reanalyses. To collaborate 
with other fields and industries, use cases should be provided for minimization algorithms, 
visualization methods and tools, database design and accessibility for managing observations, 
and an NWP ‘forecast challenge’ that can be approached by both data scientists and NWP 
specialists. 

As ensembles are likely to be used for the construction of the background error 
covariance in any of the CDA solution methods, the potential for ‘intrinsic ensemble models’ 
(i.e. adding a 5th ensemble dimension to the model) may provide an opportunity for optimization. 
This would require a refactoring of many models, and so would require a clear demonstration of 
the benefits of such an approach. It is likely that there would at least be potential savings in 
shared ‘meta-data’ across all members (e.g. grid data), reduced overhead costs for initializing 
the model, and potential coordination between model and DA parallelization schemes. This could 
lead to a transformative change for ensemble modeling (prediction/simulation/projection) and 
should be explored, if cautiously. 
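
As a rough illustration of the ‘intrinsic ensemble’ idea, the following Python sketch stores the 
grid metadata once and gives the field data an explicit ensemble dimension, so that the ensemble 
statistics used for a background error covariance can be computed from a single object. The class 
names `Grid` and `EnsembleField`, the flattened spatial layout, and all numbers are assumptions 
made for illustration; they are not taken from any existing model or DA system. 

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grid:
    """Grid metadata (hypothetical layout), stored once for the whole ensemble."""
    lats: tuple
    lons: tuple


@dataclass
class EnsembleField:
    """A field with an explicit ensemble dimension: data[member][point]."""
    grid: Grid
    data: list

    def mean(self):
        """Ensemble mean at each grid point."""
        n_members = len(self.data)
        return [sum(values) / n_members for values in zip(*self.data)]

    def covariance(self, i, j):
        """Sample covariance between grid points i and j across members."""
        n_members = len(self.data)
        m = self.mean()
        total = sum((mem[i] - m[i]) * (mem[j] - m[j]) for mem in self.data)
        return total / (n_members - 1)


# Three hypothetical members on a two-point grid; the grid is not duplicated.
grid = Grid(lats=(0.0, 0.0), lons=(10.0, 11.0))
field = EnsembleField(grid, data=[[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
ens_mean = field.mean()
cov_01 = field.covariance(0, 1)
```

In a real model the spatial and temporal dimensions would not be flattened, and the ensemble 
dimension would more plausibly be distributed across processors; the sketch only shows the 
metadata sharing and the direct access to ensemble statistics mentioned above. 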


Metrics and Diagnostics 


The Met Office has had success using salinity to find issues with the coupled model. 
<I need someone with notes from this breakout session to include them here> 


Coupled Land/Atmosphere Perspectives 
(P. de Rosnay, C. Draper, J-F. Mahfouf) 


For several decades, most NWP centers have been running weakly coupled land and atmosphere 
data assimilation, using a coupled model for the background forecast and assimilating low-level 
atmospheric observations to update the land surface soil moisture and temperatures, while also 
often running a separate snow (cover and depth) analysis. The use of screen-level observations 
has been very successful in improving low-level atmospheric forecasts in regions that are well 
observed; however, its impact on the land surface states is more mixed. 

Future developments will be focused on enhancing the coupling between land and 
atmospheric data assimilation, and assimilating more remotely sensed land surface observations. 
Based on current capabilities, stronger coupling of land and atmosphere DA will most easily be 
obtained through ensemble approaches. Some of the significant remaining challenges include 
estimating model error covariances across the land / atmosphere interface (through ensemble 
methods, or otherwise), differences in horizontal resolution among the (coarser) atmospheric 
DA, the (even coarser) ensembles in atmospheric hybrid systems, and the (full resolution) land 
DA, and the potential for unknown, and often large, land surface biases to be passed to the 
atmosphere. 

Development should focus on improving both the land and atmosphere through coupled 
DA, rather than updating one at the possible expense of the other. This can be achieved through 
assimilating observations informative of both atmosphere and surface states, assimilating a 
range of observation types selected to constrain different aspects of the model physics 
(moisture / energy), development of diagnostics targeting coupled DA, and strong support from 
model development. The coupled assimilation could itself provide a breakthrough in diagnosing 
land surface biases (and in separating random and systematic errors in the observations and 
forecast). 

The development of modular platforms, such as EMPIRE, OOPS, and JEDI, is a key 
component of coupled assimilation developments. These platforms will support the research 
activities required to explore the diversity of coupling strengths, ranging from weakly coupled 
data assimilation (coupled background, but separate analyses) to fully coupled approaches (a 
single cost function and a control vector that includes increments for all models). Likewise, 
the introduction of common data formats, observation databases, I/O, etc. for each of the land 
and atmosphere components will speed development of coupled land/atmosphere DA and enable 
operational implementation for NWP applications. 
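
The difference between the two ends of this spectrum can be seen in a toy example. The Python 
sketch below applies a single Kalman-type update (equivalent to minimizing a quadratic cost 
function in this linear case) to a two-variable state with one observation of the first 
variable. The function name `analysis`, the covariance values, and the temperatures are 
assumptions for illustration and do not come from EMPIRE, OOPS, JEDI, or any operational 
system. 

```python
def analysis(xb, B, y, r):
    """Update a two-variable background xb = [atmosphere, surface] with one
    observation y of the atmospheric variable (observation error variance r).

    B is the 2x2 background error covariance. Its off-diagonal term is what
    carries the atmospheric observation into the surface variable.
    """
    innovation = y - xb[0]
    denom = B[0][0] + r
    gain = [B[0][0] / denom, B[1][0] / denom]
    return [xb[0] + gain[0] * innovation, xb[1] + gain[1] * innovation]


xb = [280.0, 285.0]                   # hypothetical background values (K)
B_strong = [[1.0, 0.5], [0.5, 1.0]]   # cross-domain covariance retained
B_weak = [[1.0, 0.0], [0.0, 1.0]]     # separate analyses: no cross covariance
y, r = 281.0, 1.0                     # hypothetical observation and its error variance

xa_strong = analysis(xb, B_strong, y, r)
xa_weak = analysis(xb, B_weak, y, r)
```

With the cross covariance retained, the atmospheric observation also adjusts the surface 
variable within the same analysis; with separate analyses the surface variable is unchanged at 
analysis time and can only feel the observation later, through the coupled background forecast. 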

In terms of the network of observations for assimilation, continuity of L-band missions for 
sensing soil moisture beyond SMOS and SMAP is particularly important, and these follow-on 
L-band missions should provide data in near-real time (NRT). Additionally, there is an urgent 
need for remotely sensed snow mass missions, as this is a major gap in the current observing 
systems. From an evaluation perspective, there is a need for more in situ and global gridded 
observations (snow, soil moisture, fluxes), and for the development of methods for upscaling in 
situ observations to model scales. For evaluation of coupled DA, the development of coupled 
diagnostics, and associated observations, is required. 


Coupled DA for Reanalysis 


What are the challenges and priorities from a reanalysis perspective? 

The ERA-CLIM2 project has a number of key activities: observation rescue, R&D, 
reanalysis production, and reanalysis assessment. Within the ERA-CLIM2 project, apparent 
coupled model biases have been identified. Fewer data are available to constrain the system 
in the early period. There are drifts and jumps in the stratosphere, deep ocean, and sea ice. A 
challenge for strongly coupled DA is whether it will only have a positive impact, or whether it 
could potentially transfer biases from one part of the system to another. 

Research is needed to improve the coupled model. Relatedly, there is a need for dedicated 
bias correction schemes. There are fewer constraints in the coupled system to compensate for 
biases in individual components. This leads to a recommendation to encourage intercomparisons 
of biases and drifts in different coupled reanalyses. 

Changes in the observing system over time are a notable challenge. Current approaches 
that assimilate only surface pressure (atmosphere) and SST (ocean) are meant to address this, 
but better ways should be explored. Quality control is needed, especially with sparse 
observation data or when an area or variable type suddenly becomes newly observed. 

There is a need for flexibility in the representation of multiple spatial scales in the 
background error covariances. Better assimilation at the air-sea interface is needed, 
particularly to separate coupled interactions from biases. ECMWF currently uses nudging to SST. 
Weights must be specified, and this parameter is not easy to tune. ECMWF would like to shift 
from nudging to SST assimilation, as the Met Office is attempting. 
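
The sensitivity to the nudging weight can be illustrated with a minimal sketch of relaxation 
toward an observed SST. This is a generic Newtonian relaxation under assumed values, not a 
description of the ECMWF scheme; the function name `nudge_sst`, the per-step weight, and the 
temperatures are hypothetical. 

```python
def nudge_sst(sst_model, sst_obs, weight, n_steps):
    """Relax the model SST toward sst_obs. The weight is the fraction of the
    observation-minus-model difference added at each step (between 0 and 1)."""
    sst = sst_model
    for _ in range(n_steps):
        sst += weight * (sst_obs - sst)
    return sst


# Hypothetical values: model SST 290 K, observed SST 288 K, 10 coupling steps.
weak = nudge_sst(290.0, 288.0, weight=0.1, n_steps=10)
strong = nudge_sst(290.0, 288.0, weight=0.5, n_steps=10)
```

A small weight leaves much of the model's own SST (including any bias) in place, while a large 
weight pulls the model almost entirely to the observed value and may suppress coupled 
variability; neither choice comes with an error estimate, which is part of the motivation for 
moving to SST assimilation. 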

Spin-up and initialization of multiple streams is a challenge in coupled systems due to the 
many timescales present in the different components of the system. Ocean and sea-ice 
initialization at the start of the century (where data are very sparse) and also at the 
beginning of each stream is a challenge, both in determining the state itself and in estimating 
its uncertainty. Assessment is difficult because of the multiple components. Visualization and 
standardized diagnostics are needed. CDA should provide feedback on coupled model biases (e.g. 
estimated via analysis increments). 
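
A minimal sketch of such an increment-based diagnostic is given below: the time-mean analysis 
increment (analysis minus background) is compared with its spread over a set of cycles. The 
function name `increment_statistics` and the SST values are hypothetical, and the interpretation 
assumes that the assimilated observations are themselves unbiased. 

```python
from statistics import mean, pstdev


def increment_statistics(backgrounds, analyses):
    """Return the mean and spread of analysis increments (analysis minus
    background) over a sequence of assimilation cycles."""
    increments = [a - b for a, b in zip(analyses, backgrounds)]
    return mean(increments), pstdev(increments)


# Hypothetical SST backgrounds and analyses (K) at one location over four cycles.
backgrounds = [285.0, 285.5, 286.0, 285.2]
analyses = [284.5, 285.1, 285.4, 284.8]
bias_estimate, spread = increment_statistics(backgrounds, analyses)
```

A mean increment that is large relative to its spread and keeps the same sign from cycle to 
cycle suggests a systematic error in the background (here, a background that is too warm). If 
the observations are also biased, the mean increment mixes model and observation bias and 
cannot separate the two on its own. 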

A coupled reanalysis ensemble may be needed for flow-dependent covariance estimation 
and uncertainty estimation. Spurious climate signals and trends are exacerbated by model and 
observation bias. Novel observation types should be considered. These might include tracer 
observations, bottom pressure, tide gauges, and various climate proxy data such as tree ring 
widths and isotope ratios from corals and ice cores. 